knitr::opts_chunk$set(
  collapse = TRUE, comment = "#>",
  eval = identical(tolower(Sys.getenv("LLMR_RUN_VIGNETTES", "false")), "true")
)
OpenAI-compatible (OpenAI, Groq, Together, x.ai, DeepSeek)

Chat Completions accept a response_format (e.g., {"type":"json_object"} or a JSON-Schema payload). Enforcement varies by provider, but the interface is OpenAI-shaped. See the OpenAI API overview, Groq API (OpenAI-compatible), Together: OpenAI compatibility, x.ai: OpenAI API schema, and DeepSeek: OpenAI-compatible endpoint.
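For concreteness, here is a hedged sketch of the request body this implies, written as an R list (field names follow the public Chat Completions API; the exact body LLMR builds is not shown in this vignette):

# Sketch only: an OpenAI-shaped chat body with JSON mode requested.
body <- list(
  model    = "gpt-4o-mini",
  messages = list(list(role = "user", content = "Return a JSON object.")),
  response_format = list(type = "json_object")  # or a json_schema payload
)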
Anthropic (Claude)

There is no global "JSON mode." Instead, you define a tool with an input_schema (JSON Schema) and force it via tool_choice, so the model must return a JSON object that validates against the schema. See Anthropic Messages API: tools & input_schema.
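A hedged sketch of the Messages API shape this describes, again as an R list (the tool name and schema here are illustrative, not LLMR's):

# Sketch only: one tool with a JSON Schema, forced via tool_choice.
body <- list(
  model      = "claude-3-7-sonnet-latest",   # illustrative model id
  max_tokens = 1024,
  tools = list(list(
    name         = "record_answer",          # hypothetical tool name
    description  = "Return the structured answer.",
    input_schema = list(
      type       = "object",
      properties = list(answer = list(type = "string")),
      required   = list("answer")
    )
  )),
  tool_choice = list(type = "tool", name = "record_answer"),
  messages    = list(list(role = "user", content = "Answer as JSON."))
)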
Google Gemini (REST)

Set responseMimeType = "application/json" in generationConfig to request JSON. Some models also accept responseSchema for constrained JSON (model-dependent). See the Gemini documentation.
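The corresponding REST shape, as a hedged sketch (responseSchema types follow the API's Type enum; check the current docs for your model):

# Sketch only: a generateContent body with the fields named above.
body <- list(
  contents = list(list(parts = list(list(text = "Reply as JSON.")))),
  generationConfig = list(
    responseMimeType = "application/json",
    responseSchema = list(                   # model-dependent
      type = "OBJECT",
      properties = list(ok = list(type = "BOOLEAN"))
    )
  )
)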
- llm_parse_structured() strips fences and extracts the largest balanced {...} or [...] before parsing.
- llm_parse_structured_col() hoists fields (supports dot/bracket paths and JSON Pointer) and keeps non-scalars as list-columns.
- llm_validate_structured_col() validates locally via jsonvalidate (AJV).
- enable_structured_output() flips the right provider switch (OpenAI-compat response_format, Anthropic tool + input_schema, Gemini responseMimeType/responseSchema).

All chunks use a tiny helper, safe(), so your document knits even without API keys.
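The helper itself is not shown in this excerpt; a minimal sketch of the idea, assuming it only needs to evaluate the chunk and surface errors as output instead of failures:

# Hypothetical stand-in for the vignette's safe() helper.
safe <- function(expr) {
  tryCatch(expr, error = function(e) cat("ERROR:", conditionMessage(e), "\n"))
}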
safe({
  library(LLMR)

  cfg <- llm_config(
    provider    = "openai",   # try "groq" or "together" too
    model       = "gpt-4o-mini",
    temperature = 0
  )

  # Flip JSON mode on (OpenAI-compat shape)
  cfg_json <- enable_structured_output(cfg, schema = NULL)

  res    <- call_llm(cfg_json, 'Give me a JSON object {"ok": true, "n": 3}.')
  parsed <- llm_parse_structured(res)

  cat("Raw text:\n", as.character(res), "\n\n")
  str(parsed)
})
#> Raw text:
#> {
#> "ok": true,
#> "n": 3
#> }
#>
#> List of 2
#> $ ok: logi TRUE
#> $ n : num 3
What could still fail? Proxies labeled "OpenAI-compatible" sometimes accept response_format but don't strictly enforce it; LLMR's parser recovers from fences or pre/post text.
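A quick illustration of that recovery, assuming llm_parse_structured() also accepts a plain character string (the calls above pass a response object):

# Sketch only: recovery from a fenced, chatty reply.
messy_reply <- 'Sure! ```json\n{"ok": true}\n``` hope that helps'
str(llm_parse_structured(messy_reply))  # assumed to accept plain strings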
Groq serves Qwen 2.5 Instruct models behind OpenAI-compatible APIs. Their Structured Outputs feature enforces JSON Schema and (notably) expects all properties to be listed under required.
safe({
  library(LLMR); library(dplyr)

  # Schema: make every property required to satisfy Groq's stricter check
  schema <- list(
    type = "object",
    additionalProperties = FALSE,
    properties = list(
      title = list(type = "string"),
      year  = list(type = "integer"),
      tags  = list(type = "array", items = list(type = "string"))
    ),
    required = list("title", "year", "tags")
  )

  cfg <- llm_config(
    provider    = "groq",
    model       = "qwen-2.5-72b-instruct",  # a Qwen Instruct model on Groq;
                                            # model ids rotate (see the 404 below)
    temperature = 0
  )
  cfg_strict <- enable_structured_output(cfg, schema = schema, strict = TRUE)

  df <- tibble(x = c("BERT paper", "Vision Transformers"))

  out <- llm_fn_structured(
    df,
    prompt          = "Return JSON about '{x}' with fields title, year, tags.",
    .config         = cfg_strict,
    .schema         = schema,               # send schema to provider
    .fields         = c("title", "year", "tags"),
    .validate_local = TRUE
  )

  out %>% select(structured_ok, structured_valid, title, year, tags) %>% print(n = Inf)
})
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#> [2025-08-26 01:02:36.347837] LLMR Error: LLM API request failed.
#> HTTP status: 404
#> Reason: The model `qwen-2.5-72b-instruct` does not exist or you do not have access to it.
#> Tip: check model params for provider/API version.
#> [2025-08-26 01:02:36.346521] LLMR Error: LLM API request failed.
#> HTTP status: 404
#> Reason: The model `qwen-2.5-72b-instruct` does not exist or you do not have access to it.
#> Tip: check model params for provider/API version.
#> # A tibble: 2 × 5
#> structured_ok structured_valid title year tags
#> <lgl> <lgl> <chr> <chr> <chr>
#> 1 FALSE FALSE <NA> <NA> <NA>
#> 2 FALSE FALSE <NA> <NA> <NA>
With a valid key and a model id that Groq currently serves (the 404 above shows this one was not available at knit time), you should see structured_ok = TRUE and structured_valid = TRUE, plus parsed columns. (Tip: if you see a 400 complaining about required, add all properties to required, as above.)
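One hedged way to apply that tip without retyping field names, using plain base R over the schema list above:

# Require every declared property; mirrors the tip above.
schema$required <- as.list(names(schema$properties))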
Anthropic: tool-forced JSON (and a note on max_tokens)

safe({
  library(LLMR)

  schema <- list(
    type = "object",
    properties = list(
      answer     = list(type = "string"),
      confidence = list(type = "number")
    ),
    required = list("answer", "confidence"),
    additionalProperties = FALSE
  )

  cfg <- llm_config("anthropic", "claude-3-7", temperature = 0)  # may need a fully
                                                                 # qualified model id (see 404 below)
  cfg <- enable_structured_output(cfg, schema = schema, name = "llmr_schema")

  res <- call_llm(cfg, c(
    system = "Return only the tool result that matches the schema.",
    user   = "Answer: capital of Japan; include confidence in [0,1]."
  ))
  parsed <- llm_parse_structured(res)
  str(parsed)
})
#> Warning in call_llm.anthropic(cfg, c(system = "Return only the tool result that
#> matches the schema.", : Anthropic requires max_tokens; setting it at 2048.
#> ERROR: LLM API request failed.
#> HTTP status: 404
#> Reason: model: claude-3-7
#> Tip: check model params for provider/API version.
#> NULL
Anthropic requires max_tokens; LLMR warns and defaults it (to 2048, per the warning above) if you omit it.
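To silence the warning, pass it yourself; a sketch assuming llm_config() forwards provider parameters such as max_tokens:

# Assumed parameter pass-through; the default above applies if omitted.
cfg <- llm_config("anthropic", "claude-3-7", temperature = 0, max_tokens = 1024)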
safe({
  library(LLMR)

  cfg <- llm_config(
    "gemini", "gemini-2.0-flash",
    response_mime_type = "application/json"  # ask for JSON back
    # Optionally: gemini_enable_response_schema = TRUE, response_schema = <your JSON Schema>
  )

  res <- call_llm(cfg, c(
    system = "Reply as JSON only.",
    user   = "Produce fields name and score about 'MNIST'."
  ))
  str(llm_parse_structured(res))
})
#> List of 1
#> $ :List of 2
#> ..$ name : chr "MNIST"
#> ..$ score: chr "99.6"
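Note that score came back as a string above. A hedged sketch of constraining types via the optional schema arguments mentioned in the chunk's comment (parameter names taken from that comment; the schema shape follows Gemini's REST Type enum):

# Sketch only: request typed JSON (model-dependent).
schema <- list(
  type = "OBJECT",
  properties = list(
    name  = list(type = "STRING"),
    score = list(type = "NUMBER")
  ),
  required = list("name", "score")
)
cfg <- llm_config(
  "gemini", "gemini-2.0-flash",
  response_mime_type = "application/json",
  gemini_enable_response_schema = TRUE,  # from the comment above
  response_schema = schema
)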
safe({
  library(LLMR); library(tibble)

  messy <- c(
    '```json\n{"x": 1, "y": [1,2,3]}\n```',
    'Sure! Here is JSON: {"x":"1","y":"oops"} trailing words',
    '{"x":1, "y":[2,3,4]}'
  )

  tibble(response_text = messy) |>
    llm_parse_structured_col(
      fields = c(x = "x", y = "/y/0")  # dot/bracket or JSON Pointer
    ) |>
    print(n = Inf)
})
#> # A tibble: 3 × 5
#> response_text structured_ok structured_data x y
#> <chr> <lgl> <list> <dbl> <dbl>
#> 1 "```json\n{\"x\": 1, \"y\": [1,2,3]… TRUE <named list> 1 1
#> 2 "Sure! Here is JSON: {\"x\":\"1\",\… TRUE <named list> 1 NA
#> 3 "{\"x\":1, \"y\":[2,3,4]}" TRUE <named list> 1 2
Why this helps: it works when outputs arrive fenced, with pre/post text, or when arrays sneak in. Non-scalars become list-columns (set allow_list = FALSE to force scalars only).
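For scalar-only columns, the same call with the switch mentioned above (a usage sketch; messy is the vector from the previous chunk):

tibble(response_text = messy) |>
  llm_parse_structured_col(
    fields = c(x = "x", y = "/y/0"),
    allow_list = FALSE  # force scalars only, per the note above
  )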
In short: flip the provider switch with enable_structured_output(), then run llm_parse_structured() plus local validation.

Anthropic input_schema & tool_choice: https://docs.anthropic.com/en/api/messages#body-tool-choice