Structured Output vs Tool Use: Which When
By 2026 the "JSON parsing with regex" era is over. Both major model APIs offer constrained-decoding paths that produce schema-valid output, and tool use is mature enough that one or the other handles 90% of structured generation workloads. The remaining question is which to reach for — and the answer depends on whether you're returning data or taking action.
The two paths
Structured output / response_format. You pass a JSON Schema; the model returns text that conforms to it. OpenAI calls this Structured Outputs (with response_format set to a json_schema). Anthropic offers similar capabilities via the tool-call mechanism, with the SDK packaging it as a structured-output convenience.
Tool use / function calling. You define one or more tools as JSON Schemas; the model picks one and emits arguments that conform. The arguments are validated against the schema; you execute the tool; the result goes back into the loop.
Same constraint mechanism (constrained decoding); different mental model.
How the two providers handle each
OpenAI integrates structured outputs directly with their API. With response_format: { type: "json_schema" } and strict: true, the model is forced into schema-valid output. This is widely described as guaranteed schema compliance; the constrained decoder rejects invalid token continuations during generation.
Anthropic delivers structured output primarily through the tool-use pattern. You define a tool with the schema; the model "calls" the tool; you read the arguments. Per several public comparisons, Anthropic's SDK silently transforms your schema by removing constraints like minimum, maximum, minLength, pattern and moving them to the description field. The constrained decoder validates after generation and retries if validation fails. [Inference]
Practical implication: schemas with rich constraints (e.g., string pattern: "^[A-Z]{3}\\d{4}$") round-trip more cleanly through OpenAI's structured output than through Anthropic's tool-use path. Adding a Pydantic / Zod validator on your end as a safety net is reasonable on either provider.
When to pick structured output
Reach for response_format / json_schema when:
Examples: classification ("which intent does this fit?"), entity extraction, summarization with metadata, generating UI props.
When to pick tool use
Reach for tool calling when:
Examples: any agent that does more than one thing, RAG with multiple retrievers, function dispatch, anything that loops.
The blurry middle
There's a class of tasks where either works:
{fields, escalate: bool} or one tool call to extract_and_decideRules of thumb in 2026:
Latency considerations
Structured output is usually a single round-trip. Tool use is at least one more — the model emits a tool call, you execute, you send the result back, the model produces the final response. For latency-sensitive surfaces (chat completions seen by humans), this can matter.
A workaround for the tool-use latency tax: if your tool execution is fast and deterministic, return the result eagerly while the user is still seeing the model's first message. Some agent frameworks do this automatically; you can replicate the pattern in raw API code with parallel tool execution.
Reliability considerations
Both paths have edge cases:
In both cases, structured outputs alone don't replace evals. They make the parsing layer reliable; the reasoning layer still needs evidence.
Cost considerations
[Inference] Tool use generally costs more per round-trip because of the extra system-prompt overhead for tool descriptions and the extra round-trips. Structured output is closer to a vanilla completion in token consumption. If a task can be done as a single structured output call, that's typically the cheaper path.
A pragmatic split
Most agent systems in 2026 use both:
Don't pick one for everything. They are complementary, not competitive.
