Agent Handoff Patterns: Specialization at Scale
A handoff is the cleanest multi-agent primitive in 2026 — one agent transfers control to another, carrying conversation context, and the new agent owns the next response. The pattern shows up across frameworks (it's the core abstraction in the OpenAI Agents SDK, and it's expressible in LangGraph as a conditional edge plus state copy). Done well, handoffs let you scale specialization without scaling complexity. Done badly, they introduce loops, latency, and lost context.
What a handoff actually is
In the OpenAI Agents SDK model, a handoff is a one-way transfer of execution. When agent A hands off to agent B, B receives the conversation history and takes over the conversation. A is out of the loop until B (or someone B hands off to) finishes.
Mechanically it's a special tool call — the model emits transfer_to_specialist, the SDK intercepts it, swaps the active agent and conversation context, and resumes generation in the new agent.
This is intentionally different from the "agents-as-tools" pattern, where one agent calls another like a function and continues running. Handoffs delegate ownership; agents-as-tools borrow capability. Both are useful for different shapes of problem.
The triage + specialist pattern
The bread-and-butter handoff layout:
User → Triage Agent → routes to:
├─ Billing Specialist
├─ Tech Support Specialist
└─ Account SpecialistTriage owns intent classification and a small set of "deflection" responses (FAQs, how to reach a human). Each specialist owns one domain with its own tool set, system prompt, and model choice. The triage agent's tool set is just the handoffs.
Why this works:
Returning control
A common question: when the specialist is done, does control return to triage? The answer in the OpenAI SDK model is "no, by default." Handoffs are one-way. The specialist owns the conversation until it hands off elsewhere.
This is correct most of the time but has a sharp edge: if the user pivots topics ("oh actually I have a billing question now"), the current specialist needs to detect intent change and hand off back to triage (or directly to billing). Two patterns work:
Skip both and your account specialist will start fielding tech-support questions poorly.
Tool name overrides and input filters
The SDK exposes useful customization on each handoff:
transfer_to_<agent_name>. Override to something more natural-language ("escalate_to_human") when the handoff is user-facing.on_handoff callback. Fires when the handoff is invoked — useful for logging, telemetry, or capturing the transition point in your observability backend.input_filter. Decide what conversation history the receiving agent sees. By default the full transcript transfers; for privacy or context-window reasons you may want to filter or summarize.is_enabled. Boolean or function — dynamically enable/disable a handoff at runtime, useful when the specialist is down or the user is on a plan that doesn't include that capability.Anti-patterns and how to avoid them
Handoff loops
A → B → A → B forever. Common causes:
What helps:
on_handoff callback to detect cycles and short-circuitLost context
The receiving agent doesn't know critical state from earlier in the conversation. Symptoms: the specialist asks the user to repeat information they already provided.
What helps:
input_filter to summarize and inject critical state into the receiving agent's contextToo many specialists
OpenAI's practical guide suggests handoff lists become unwieldy past 8–10 agent types. [Inference] At that scale, the triage agent is itself struggling to choose, and you should consider hierarchical handoffs (triage → category triage → specialist) rather than one flat list.
Handoffs vs agents-as-tools
When to pick which:
Mixing both in one system is fine and common. The naming matters more than the implementation: "transfer to billing" is a handoff, "summarize document" is a tool call.
Beyond the SDK
LangGraph expresses handoffs as conditional edges with state copy — same semantics, different syntax. CrewAI and AutoGen each have their own handoff primitives. The pattern is portable; the implementations differ.
[Speculation] In 2026, the conceptual model is more stable than the framework choice. If you build clean handoff boundaries, swapping frameworks later is mostly mechanical.