You split your monolithic agent — a program that acts on your behalf — into a research sub-agent and a code sub-agent, exactly like the SDK docs suggested. Delegation! Division of labor! Modern management theory, but for AI. What could go wrong.
In production, quite a lot. The code sub-agent cheerfully ignores constraints the research sub-agent discovered. The parent agent shrugs. You stare at logs wondering where half the context — all the information the AI needs to do its job — vanished between point A and point B. Welcome to multi-agent telephone.
Three Platforms, Three Ways to Lose Your Data
Between April 9 and 17, 2026, the three biggest AI platforms all shipped or updated sub-agent delegation — letting one AI hand off work to another AI — as a first-class feature:
- April 9: Anthropic launched Managed Agents in public beta. Each sub-agent gets a fresh session — a clean conversation slate — plus an instruction string.
- April 15: OpenAI updated its Agents SDK with sandboxed sub-agent routing. Default behavior: pass the entire conversation history to the next agent.
- April 17: Google ADK (Agent Development Kit), which first shipped multi-agent support in late March, updated its multi-agent docs and session state model — basically a shared whiteboard where agents scribble notes to each other. Their own docs include this gem: "the Root Agent is effectively out of the loop."
Three platforms. Three incompatible mechanisms. Zero documentation on what actually gets lost at the handoff boundary.
The Telephone Game, Quantified
Here's how each platform passes context when Agent A delegates to Agent B:
# OpenAI: passes a filtered message list via HandoffInputData
class HandoffInputData:
input_history: list # full chat history, filterable
pre_handoff_items: list
new_items: list
# Default: everything passes through.
# But input guardrails (safety filters) apply ONLY
# to the first agent. The rest run unguarded.
# Anthropic: starts a brand-new session per agent
# POST /v1/sessions → fresh context, clean slate
# "brains can pass hands to one another"
# ...but the new brain starts with selective amnesia
# Google ADK: shared state dictionary
session.state["research_results"] = findings
# Other agent reads the key. If it exists.
# Parallel execution? Race conditions (two agents
# writing to the same key simultaneously) are on you.
The decay isn't theoretical. A February 2026 UC Berkeley study of 1,600+ traces across seven agent frameworks found failure rates up to 86.7%. XTrace analysis showed a research agent producing 3,000 useful tokens — word-chunks the AI processes — buried in 40,000 tokens of total context. That's a 93% noise ratio at the handoff. The study broke failures into three categories: context loss (information simply vanishing between agents), context corruption (information arriving but semantically distorted), and context dilution (useful information drowned in noise). A March 2026 Google DeepMind paper on multi-agent coordination measured 39–70% reasoning degradation at delegation boundaries.
As BriefHQ put it on March 11: "What disappeared along the way was not raw information. What disappeared was decision context."
The Price of Fixing It
Your options aren't great:
- Serialize full context into the delegation prompt — burns tokens (at ~$5–25 per million for frontier models) and eats your context window alive
- Shared memory stores — adds vendor lock-in and another point of failure
- Skip delegation entirely — back to single-agent monoliths that choke on complex workflows
No platform provides a built-in mechanism for a parent agent to verify what its child actually received versus what the parent sent. You're managing a team that can't CC you on emails.
Before You Decompose
Before you split your agent into a multi-agent workflow, run one dead-simple test: inject a specific constraint at the top of the chain and check whether the bottom agent honors it. Something like "never use pandas" or "all outputs must be in metric units." If the last agent violates it — congratulations, you found your context leak.
Take it a step further. Log the token count at each handoff boundary. If Agent A sends 3,000 tokens of research and Agent B's effective context only contains 200 tokens of it, you know exactly where the drain is. No fancy tracing framework required — a print statement at each delegation point tells the whole story. Do this before deploying to production. Do this before writing a single line of orchestration code.
Every platform pitches multi-agent delegation as "managing a team." But the team members can't read each other's notes, the meeting minutes get shorter at every level of the org chart, and nobody built a mechanism to detect the information loss. An October 2025 Gartner report predicts over 40% of agentic AI projects will be canceled by 2027. Looking at the handoff architectures these three platforms shipped in April 2026, that number feels optimistic.


