You Secured Your Agent's Tool Calls. Nobody Secured the Answers.

You did everything right. You vetted your MCP (Model Context Protocol — a universal plug standard for AI tools, like USB but for data) servers, locked down permissions, pinned schema versions so your AI agent — a program that uses tools on its own — only calls what you approve. Your agent infrastructure feels production-hardened. You sleep well.

You shouldn't.

Because every tool your agent calls sends a response back. And as of April 25, 2026, almost nobody in the industry validates what's inside that response before it lands in the agent's context window — the working memory where the AI model can't tell trusted instructions from garbage a tool just spit back.

Three Platforms, Same Blind Spot

Since early April, the three biggest AI companies shipped agent security features — all guarding the wrong door.

On April 8, Anthropic launched Managed Agents with scoped permissions and credential storage. It controls which tools the agent can call. What those tools answer? Not their problem.

On April 16, OpenAI updated its Agents SDK with automatic tracing — a logging system that records every tool call, handoff, and guardrail event. It observes responses. It doesn't sanitize them. That's like installing a security camera that watches someone walk in with a knife and writes it down.

On April 22, Google shipped Agent Gateway at Cloud Next with Model Armor, which actually sanitizes both tool calls and responses — screening for prompt injection, malicious URLs, and data leakage. Google, to its credit, is the only major platform that explicitly guards the response side. It's in preview.

Why This Matters: The Door Is Wide Open

MCP's specification defines inputSchema — a strict format for what you send to a tool. There is no outputSchema. Tool responses are arbitrary text or JSON that flows unfiltered into the model's reasoning. The spec literally doesn't have a field for "validate what comes back."

This creates three attack vectors that should keep you up at night:

Indirect prompt injection — a tool returns content with hidden instructions baked in. The PipeLab State of MCP Security 2026 report (published April 2026) documents a real case: an attacker crafted a malicious GitHub issue so that when an MCP server fetched it, the response instructed the agent to exfiltrate private repository contents. "The tool descriptions were clean. The poisoning sat in the data the tool returned."

Context flooding — a tool returns so much data it drowns the agent's working memory, pushing critical instructions out of the context window.

Data exfiltration chains — a poisoned response tells the agent to forward sensitive context to another tool. The Log-To-Leak research paper (published March 2026) demonstrated this across GPT-5, Claude Sonnet 4, and others — achieving a 100% attack success rate on GPT-5 connected to a PayPal MCP server, with 94.6% data leak accuracy.

Meanwhile, on April 16, OX Security disclosed 11 CVEs affecting roughly 200,000 MCP server instances. Anthropic's official response: sanitization is "the developer's responsibility." Even the OWASP MCP Top 10 (released April 2026) — the industry's first attempt at an MCP security framework — has no dedicated category for unvalidated tool responses. The gap is so normalized that the people writing the security standards haven't named it yet.

The Price of Fixing It

Adding response validation breaks the simplicity that made MCP successful in the first place. Tools would need output schemas. Agents would need a sanitization layer — something like Microsoft's Agent Governance Toolkit (open-sourced April 2), which includes an MCP security gateway with response inspection. Every call gains parsing overhead. The "just plug tools in" experience dies.

But the alternative is worse.

What This Means for You

Until response-side validation ships everywhere, every MCP server you connect is an unfiltered pipe into your agent's brain. All the security budget you spent on input gates protects the wrong end of the call. If you're running agents in production today, you need either Google's Model Armor (preview), Microsoft's AGT, or your own response sanitization middleware. "Trust the tool" is not a security policy.

You locked the front door. The back door doesn't have a lock. It doesn't even have a door.

The next major agent security incident won't come from a bad tool call. It'll come from a tool's answer.

You Secured Your Agent's Tool Calls. Nobody Secured the Answers.

Three Platforms, Same Blind Spot

Why This Matters: The Door Is Wide Open

The Price of Fixing It

What This Means for You

Keep reading

Google ADK 1.0: Your AI Tools Might Be Secret Agents Now

Every Text Your AI Agent Reads Is an Unsigned Command

Build Your First MCP Server in Python: 40 Lines From Copy-Paste Human to AI That Sees Your Data

Your Agent Picks the Wrong Tool Because You Wrote a Bad Description — And No Platform Cares