You connected a dozen MCP servers to your AI agent. GitHub, Slack, Linear, Postgres, S3, web search — the whole buffet. Your agent can theoretically touch your entire stack. You feel powerful. The agent does not.

It started fumbling tasks it used to nail. Picking the wrong tool. Hallucinating parameters that don't exist. Forgetting context you literally just typed. You didn't break anything — you just fed it too many menus to read before it could start cooking.

The Math Nobody Warned You About

On April 14, Cloudflare published an Enterprise MCP Reference Architecture that put actual numbers on the problem. MCP (Model Context Protocol) is a universal plug standard for AI tools — like USB, but for connecting agents to external services. Each MCP tool ships a schema telling the model what it does and what parameters it needs. Every single turn, the model reads all of them.

As we broke down in yesterday's Tool-Calling Is Dead, Cloudflare's own portal burned ~9,400 tokens on tool descriptions alone — before the agent touched your actual problem. GitHub's MCP server (94 tools) ate ~42,000 tokens. The numbers bear repeating only because nothing changed between then and now. People just kept plugging in servers.
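To see what those numbers mean per turn, here is a rough back-of-envelope sketch. The 200,000-token context window and the 20-turn session are assumptions picked for illustration, not figures from Cloudflare, and the math ignores prompt caching.

```python
# Back-of-envelope cost of tool schemas, using the GitHub figures quoted above.
# CONTEXT_WINDOW and the 20-turn session are assumed values for illustration.
CONTEXT_WINDOW = 200_000


def schema_overhead(schema_tokens: int, turns: int,
                    context_window: int = CONTEXT_WINDOW) -> dict:
    """Return the share of the window and the cumulative tokens spent on schemas."""
    return {
        "share_of_window": schema_tokens / context_window,
        "tokens_over_session": schema_tokens * turns,
    }


github = schema_overhead(42_000, turns=20)
per_tool = 42_000 / 94  # average schema size per GitHub tool

print(f"GitHub MCP: {github['share_of_window']:.0%} of the window per turn")  # 21%
print(f"Over 20 turns: {github['tokens_over_session']:,} tokens re-read")     # 840,000
print(f"About {per_tool:.0f} tokens per tool")                                # 447
```

Under those assumptions, one server claims about a fifth of the window before the first user message lands.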

A benchmark from March 6 had already documented the accuracy collapse: tool selection dropped from ~95% with 4 focused tools to ~71% with 46 tools. Six weeks later, Cloudflare confirmed the same problem at enterprise scale. The protocol didn't change. The server count did.

Everyone's Fixing It, Nobody Agrees How

Cloudflare shipped Code Mode on April 16, nuking the tool phone book and replacing it with a typed API. Two entry points instead of 2,500+. Tool-description tokens dropped 99.9%. Brilliant. Also locked to Cloudflare Workers. They solved an open-standard problem with a proprietary solution. Classic.

Atlassian took the compression route. Their open-source mcp-compressor, released March 29, squeezes GitHub MCP's 94 tools from 17,600 tokens to 500 at max compression (97% reduction). Think minifying your API docs until even you can't read them. The model somehow still can — but the tradeoff is real. Atlassian's own benchmarks show max compression drops parameter constraint fidelity: complex tools with nested object schemas lose the validation hints models need for correct invocations. Their docs recommend medium compression (80% reduction, ~3,500 tokens) for production and reserve max for "exploration only." The honest version: you're trading accuracy for headroom and hoping the model fills in the gaps.
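The percentages are easy to sanity-check. This sketch uses only the token counts quoted above; the helper name is made up for illustration and is not part of mcp-compressor.

```python
# Sanity check on the compression figures quoted above.
# `reduction` is a hypothetical helper, not an mcp-compressor API.
def reduction(before: int, after: int) -> float:
    """Fraction of tokens removed when going from `before` to `after`."""
    return 1 - after / before


BASELINE = 17_600  # uncompressed GitHub MCP schemas, per the figures above

max_reduction = reduction(BASELINE, 500)
medium_tokens = round(BASELINE * 0.20)  # an 80% reduction leaves 20% of the tokens

print(f"Max compression: {max_reduction:.1%} reduction")   # 97.2%
print(f"Medium compression: ~{medium_tokens:,} tokens")    # ~3,520
```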

Anthropic went a different route entirely. On April 8, they launched Managed Agents at $0.08/hour — specialized sub-agents with narrow 5–10 tool kits instead of one generalist drowning in 50. Each sub-agent loads only its own tools per turn, cutting per-agent overhead roughly 85%. The fix for too many tools? More agents with fewer tools each. Recursion as a service.
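The general shape of the idea, independent of any vendor, looks something like this. It is a minimal sketch and not Anthropic's implementation: the agent names, tool names, and keyword matching are all hypothetical, and a real router would more likely ask a model to classify the task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgent:
    name: str
    tools: tuple[str, ...]     # the only schemas this agent ever loads
    keywords: tuple[str, ...]  # crude routing signal, for illustration only


# Hypothetical agents and tools; none of these names come from a real product.
AGENTS = (
    SubAgent("code", ("create_pr", "list_issues", "get_file"),
             ("pr", "issue", "repo", "commit")),
    SubAgent("chat", ("post_message", "list_channels"),
             ("slack", "message", "channel")),
    SubAgent("data", ("run_query", "describe_table"),
             ("sql", "table", "query", "postgres")),
)


def route(task: str, agents: tuple[SubAgent, ...] = AGENTS) -> SubAgent:
    """Pick the sub-agent whose keywords overlap most with the task text."""
    words = set(task.lower().split())
    best = max(agents, key=lambda a: len(words & set(a.keywords)))
    if not words & set(best.keywords):
        raise ValueError(f"no sub-agent matches task: {task!r}")
    return best


agent = route("open a pr for the repo fix")
total_tools = sum(len(a.tools) for a in AGENTS)
print(f"{agent.name} agent loads {len(agent.tools)} of {total_tools} tools")
```

The point is the last line: whichever agent picks up the task only ever pays for its own handful of schemas.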

And then there are the teams that skipped optimization entirely and just started deleting things. On March 12, GitHub Copilot's engineering team shared results from cutting their tool count from 40 to 13: a 2–5 point benchmark improvement and a 400ms latency drop. In February, Block rebuilt its Linear MCP server three times, shrinking from 30+ tools down to 2. On April 3, Phil Schmid (Hugging Face) distilled the pattern into a single rule: "Curate ruthlessly. 5 to 15 tools per server. One server, one job." No compression algorithm. No discovery layer. Just discipline.

The Real Problem Is the Protocol

Here's what none of these solutions fix: every single one is proprietary, platform-specific, or a workaround for a hole in MCP itself.

Cloudflare Code Mode runs on Workers. Managed Agents run with Claude. Atlassian's compressor is the most portable option — and it's still duct tape on a protocol that shipped without a table of contents.

Anthropic pitched MCP as the universal standard. The one connector to rule them all. Instead, we're building vendor-specific discovery layers on top of the universal standard to make it actually work at scale.

We've watched this exact movie before. CORBA in the '90s — a "universal" object protocol that spawned an entire industry of vendor-specific bridges just to make it usable. The Interface Repository promised dynamic discovery; in practice, every ORB vendor shipped their own. SOAP in the 2000s — the enterprise "standard" everyone quietly routed around with REST because WSDL files grew into unreadable monstrosities. JavaScript modules — AMD, CommonJS, UMD, a full decade of fragmentation before ES modules arrived. The pattern never changes: open standard ships incomplete, vendors fill the gaps with proprietary layers, ecosystem fragments until someone fixes the standard or kills it.

MCP is in the vendor gap-filling phase. Cloudflare, Anthropic, Atlassian, and a dozen smaller players are each building their own answer to the same missing feature: dynamic tool discovery. The protocol needs to handle this natively. It doesn't. So we get a stack of competing solutions and call it an ecosystem.

The optimistic read: competition drives innovation, the best approach wins, the standard absorbs it. The realistic read — the one I'd bet on — is that major model providers bake their preferred discovery into default agent frameworks, and "universal" quietly starts meaning "works with Claude" or "works with GPT" but not both. USB-C with vendor charging protocols, all over again.

What You Actually Do Today

Audit your MCP connections. Remove servers your agent hasn't called in a week. Group remaining tools by task domain. Measure token usage before and after — you'll be surprised how much headroom you recover.
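If you want to script that audit, a minimal sketch might look like this. It assumes you can already get two things out of your own logs: a token count per server's schemas and a last-called timestamp per server. Apart from GitHub's ~42,000, the token counts and all the dates below are made-up sample data.

```python
from datetime import datetime, timedelta


def audit(schema_tokens: dict[str, int],
          last_called: dict[str, datetime],
          now: datetime,
          max_idle: timedelta = timedelta(days=7)):
    """Return (stale servers, schema tokens before, schema tokens after removal)."""
    stale = sorted(
        server for server in schema_tokens
        if server not in last_called or now - last_called[server] > max_idle
    )
    before = sum(schema_tokens.values())
    after = before - sum(schema_tokens[s] for s in stale)
    return stale, before, after


# Sample data: only the GitHub figure comes from the article, the rest is invented.
now = datetime(2025, 1, 15)
schema_tokens = {"github": 42_000, "slack": 6_000, "linear": 9_000, "s3": 4_000}
last_called = {
    "github": now - timedelta(days=1),
    "slack": now - timedelta(days=10),
    "linear": now - timedelta(days=2),
    # "s3" has never been called, so it has no entry
}

stale, before, after = audit(schema_tokens, last_called, now)
print(f"Remove: {stale}")
print(f"Schema tokens per turn: {before:,} -> {after:,}")
```

Servers with no recorded call count as stale here, which is a design choice: if the agent has never touched a server, it is paying for schemas it has no track record of using.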

MCP doesn't need more servers. It needs a package manager moment: dynamic discovery and lazy loading that treats tools like imports, not global variables crammed into every prompt. Until then, less is more. And the agents that perform best won't be the ones with the most tools. They'll be the ones that learned to say no.
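What "tools like imports" could look like, as a purely hypothetical sketch: the model sees a one-line index of every tool, and a full schema only enters the prompt once that tool is actually requested. Nothing here is part of the MCP specification today; the class, method names, and sample tools are invented for illustration.

```python
from typing import Callable


class LazyToolRegistry:
    """Hypothetical registry: cheap index up front, full schemas on demand."""

    def __init__(self) -> None:
        self._entries: dict[str, tuple[str, Callable[[], dict]]] = {}
        self._loaded: dict[str, dict] = {}

    def register(self, name: str, summary: str, loader: Callable[[], dict]) -> None:
        self._entries[name] = (summary, loader)

    def index(self) -> dict[str, str]:
        """One-line summaries: the only thing sent on every turn."""
        return {name: summary for name, (summary, _) in self._entries.items()}

    def load(self, name: str) -> dict:
        """Fetch the full schema the first time a tool is needed, then cache it."""
        if name not in self._loaded:
            _, loader = self._entries[name]
            self._loaded[name] = loader()
        return self._loaded[name]

    def prompt_schemas(self) -> list[dict]:
        """Full schemas currently in the prompt: only the ones loaded so far."""
        return list(self._loaded.values())


registry = LazyToolRegistry()
registry.register("create_issue", "Open a new issue",
                  lambda: {"name": "create_issue", "parameters": {"title": "string"}})
registry.register("run_query", "Run a read-only SQL query",
                  lambda: {"name": "run_query", "parameters": {"sql": "string"}})

print(registry.index())                 # two short summaries
print(len(registry.prompt_schemas()))   # 0
registry.load("create_issue")
print(len(registry.prompt_schemas()))   # 1
```

The tradeoff is an extra round trip when a tool is first needed, in exchange for not carrying every schema on every turn.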