Agent Handoff Patterns: Specialization at Scale

CallMissed
·5 min readGuide

A handoff is the cleanest multi-agent primitive in 2026 — one agent transfers control to another, carrying conversation context, and the new agent owns the next response. The pattern shows up across frameworks (it's the core abstraction in the OpenAI Agents SDK, and it's expressible in LangGraph as a conditional edge plus state copy). Done well, handoffs let you scale specialization without scaling complexity. Done badly, they introduce loops, latency, and lost context.

What a handoff actually is

In the OpenAI Agents SDK model, a handoff is a one-way transfer of execution. When agent A hands off to agent B, B receives the conversation history and takes over the conversation. A is out of the loop until B (or someone B hands off to) finishes.

Mechanically it's a special tool call — the model emits transfer_to_specialist, the SDK intercepts it, swaps the active agent and conversation context, and resumes generation in the new agent.

This is intentionally different from the "agents-as-tools" pattern, where one agent calls another like a function and continues running. Handoffs delegate ownership; agents-as-tools borrow capability. Both are useful for different shapes of problem.

The triage + specialist pattern

The bread-and-butter handoff layout:

Code
User → Triage Agent → routes to:
                       ├─ Billing Specialist
                       ├─ Tech Support Specialist
                       └─ Account Specialist

Triage owns intent classification and a small set of "deflection" responses (FAQs, how to reach a human). Each specialist owns one domain with its own tool set, system prompt, and model choice. The triage agent's tool set is just the handoffs.

Why this works:

  • Each specialist sees a smaller tool set → tighter routing, less hallucination
  • Each specialist's system prompt is focused → better behavior in its domain
  • Triage stays small and cheap → fast classification, low first-response latency
  • Adding a new domain is a contained change → register a new specialist, add the handoff to triage
  • Returning control

    A common question: when the specialist is done, does control return to triage? The answer in the OpenAI SDK model is "no, by default." Handoffs are one-way. The specialist owns the conversation until it hands off elsewhere.

    This is correct most of the time but has a sharp edge: if the user pivots topics ("oh actually I have a billing question now"), the current specialist needs to detect intent change and hand off back to triage (or directly to billing). Two patterns work:

  • Intent-aware specialists. Each specialist has a "this is out of my domain" handoff back to triage in its tool set. The system prompt instructs them to use it.
  • Periodic re-triage. Every N turns, or on confidence drop, run triage on the latest message and switch agents if intent changed.
  • Skip both and your account specialist will start fielding tech-support questions poorly.

    Tool name overrides and input filters

    The SDK exposes useful customization on each handoff:

  • Tool name override. Default is transfer_to_<agent_name>. Override to something more natural-language ("escalate_to_human") when the handoff is user-facing.
  • on_handoff callback. Fires when the handoff is invoked — useful for logging, telemetry, or capturing the transition point in your observability backend.
  • input_filter. Decide what conversation history the receiving agent sees. By default the full transcript transfers; for privacy or context-window reasons you may want to filter or summarize.
  • is_enabled. Boolean or function — dynamically enable/disable a handoff at runtime, useful when the specialist is down or the user is on a plan that doesn't include that capability.
  • Anti-patterns and how to avoid them

    Handoff loops

    A → B → A → B forever. Common causes:

  • Every specialist has a "back to triage" handoff and triage routes back to the same specialist
  • Two specialists both think the same intent is theirs
  • What helps:

  • Cap handoffs per session (e.g., 5)
  • Give each specialist a clear "I cannot handle this" final response in addition to the handoff option
  • Use the on_handoff callback to detect cycles and short-circuit
  • Lost context

    The receiving agent doesn't know critical state from earlier in the conversation. Symptoms: the specialist asks the user to repeat information they already provided.

    What helps:

  • Use input_filter to summarize and inject critical state into the receiving agent's context
  • Maintain durable session state in a shared store (Postgres, Redis) that all agents read; don't rely solely on the transcript
  • Log what each handoff carried — most context bugs are visible in tracing
  • Too many specialists

    OpenAI's practical guide suggests handoff lists become unwieldy past 8–10 agent types. [Inference] At that scale, the triage agent is itself struggling to choose, and you should consider hierarchical handoffs (triage → category triage → specialist) rather than one flat list.

    Handoffs vs agents-as-tools

    When to pick which:

  • Handoff — when the new agent should own the next response. Customer support routing, escalations, "talk to a different expert."
  • Agent-as-tool — when the calling agent wants to use the other agent and continue. "Summarize this document, then write a reply" — the summarizer is a tool, not a handoff target.
  • Mixing both in one system is fine and common. The naming matters more than the implementation: "transfer to billing" is a handoff, "summarize document" is a tool call.

    Beyond the SDK

    LangGraph expresses handoffs as conditional edges with state copy — same semantics, different syntax. CrewAI and AutoGen each have their own handoff primitives. The pattern is portable; the implementations differ.

    [Speculation] In 2026, the conceptual model is more stable than the framework choice. If you build clean handoff boundaries, swapping frameworks later is mostly mechanical.

    Frequently Asked Questions

    Can a handoff target hand back to the original agent?
    Yes — register the original as a handoff target on the receiving agent. Cap total handoffs per session to avoid loops, and use observability to spot cycles early.
    How is a handoff different from calling an agent as a tool?
    Handoffs transfer ownership of the conversation; the original agent steps out. Agent-as-tool keeps the original agent in control — it calls the other agent, gets a result, continues. Pick handoff when the next response should come from the new agent; pick agent-as-tool when you just need a sub-result.
    What happens to the system prompt during a handoff?
    The receiving agent's own system prompt becomes active. The conversation transcript transfers; the original agent's system prompt does not. This is the source of most "specialist behaves differently than expected" bugs — write each specialist's prompt assuming it's the only one the user is talking to.

    Related Posts