Your Agent Picks the Wrong Tool Because You Wrote a Bad Description — And No Platform Cares

You hooked your AI agent up to a dozen tools — Slack, GitHub, Jira, a database — and watched it confidently fire a Jira comment where a Slack message should go. Then it charged you for the privilege. Classic Tuesday.

Your instinct says "get a better model." But the model isn't reasoning badly. It's reading the only information it has about each tool: a description field — a few lines of plain text — that some developer wrote at 2 AM during a hackathon. That description is a prompt. You just didn't know you were writing one.

April 2026 delivered an agent platform blitz. On April 8, Anthropic launched Claude Managed Agents — a cloud service handling infrastructure, state management, and tool orchestration at $0.08 per session-hour. On April 15, OpenAI updated its Agents SDK with sandbox environments and guardrails. Then at Cloud Next (April 22–24), Google unveiled the Gemini Enterprise Agent Platform with a headline feature: Agent Optimizer — an algorithm that auto-tunes agent instructions by clustering real-world failures.

Three platforms in three weeks, each promising to make your agents smarter. Here's the catch none of them mentioned: whatever optimization they offer stops at the system prompt. None of them touch tool descriptions.

According to Google's own docs, the Agent Optimizer algorithm operates exclusively on system instructions. The description field in every tool schema — the text the model actually reads to decide which tool to call — sits in a blind spot. Anthropic's Managed Agents inherit whatever MCP descriptions you feed them. OpenAI's SDK passes through your function schemas as-is. The optimization stops at the front door.

Here's the mechanism. When an agent invokes tools, the LLM receives a JSON schema for every registered tool. Each schema includes a plain-text description field. The model reads all of them on every call and picks the best match. MCP, OpenAI function calling, Google's ADK — same pattern. This is prompt engineering in disguise, and no platform validates these prompts for you.
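To make that concrete, here is a minimal sketch of what the model is choosing between. The field names (name, description, inputSchema) follow the MCP tool definition shape; the tool names and the description wording are hypothetical, invented for illustration, and not taken from any real server.

```python
import json


def tool(name, description):
    """Build a hypothetical MCP-style tool definition with one string parameter."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }


# Two tools whose schemas are identical apart from name and description.
VAGUE = [
    tool("post_comment", "Posts a comment."),
    tool("send_message", "Sends a message."),
]

# The same tools with descriptions that say where the text ends up
# and when the other tool is the right choice (wording is made up).
SHARPER = [
    tool(
        "post_comment",
        "Add a comment to an existing Jira issue. Use only when the user "
        "names an issue key; for chat notifications use send_message.",
    ),
    tool(
        "send_message",
        "Send a chat message to a Slack channel. Use for team notifications; "
        "for issue tracker comments use post_comment.",
    ),
]


def selection_text(tools):
    """Return the name/description pairs, the only free text that tells these tools apart."""
    return [f"{t['name']}: {t['description']}" for t in tools]


if __name__ == "__main__":
    print(json.dumps(VAGUE, indent=2))
    for line in selection_text(SHARPER):
        print(line)
```

In the vague pair, nothing in the payload says which destination is Jira and which is Slack, so the model is left guessing from two near-synonymous verbs.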

The quality of those prompts is grim. A March 2026 benchmark found that over 97% of MCP server descriptions contain at least one quality issue — unclear purpose statements, missing edge cases, ambiguous parameter semantics. We've covered the downstream effects before: tool sprawl tanks accuracy, and the teams that audit aggressively see immediate gains. But the root cause persists. Nobody reviews description text with the same rigor they review code.

Meanwhile, those descriptions eat tokens whether the tool fires or not. The GitHub MCP server alone (93 tools) injects ~55,000 tokens just for schemas. Stack GitHub, Slack, and Sentry together: 143,000 tokens. That's roughly 72% of a 200K context window (143,000 / 200,000 ≈ 71.5%) consumed before the agent does anything useful. At 100 requests a day, that's $510 a month in pure schema overhead. You're not paying for intelligence. You're paying for the model to read bad documentation on every call.
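The arithmetic behind figures like these is easy to rerun for your own stack. The sketch below uses the token counts quoted above; the per-token price is a placeholder, not the rate behind the $510 estimate (which the figure does not state), and the 30-day month is also an assumption.

```python
def context_share(schema_tokens, context_window):
    """Fraction of the context window taken up by tool schemas alone."""
    return schema_tokens / context_window


def monthly_schema_tokens(schema_tokens, requests_per_day, days=30):
    """Input tokens spent per month just re-sending tool schemas."""
    return schema_tokens * requests_per_day * days


def monthly_schema_cost(schema_tokens, requests_per_day,
                        usd_per_million_input_tokens, days=30):
    """Monthly schema overhead in USD for a given input-token price."""
    tokens = monthly_schema_tokens(schema_tokens, requests_per_day, days)
    return tokens / 1_000_000 * usd_per_million_input_tokens


if __name__ == "__main__":
    SCHEMA_TOKENS = 143_000      # GitHub + Slack + Sentry, as quoted above
    CONTEXT_WINDOW = 200_000
    REQUESTS_PER_DAY = 100
    PLACEHOLDER_PRICE = 1.00     # hypothetical USD per million input tokens

    print(f"context share: {context_share(SCHEMA_TOKENS, CONTEXT_WINDOW):.1%}")
    print(f"tokens/month:  {monthly_schema_tokens(SCHEMA_TOKENS, REQUESTS_PER_DAY):,}")
    cost = monthly_schema_cost(SCHEMA_TOKENS, REQUESTS_PER_DAY, PLACEHOLDER_PRICE)
    print(f"cost/month at placeholder price: ${cost:,.2f}")
```

Swap in your provider's actual input-token rate and your own request volume; the overhead scales linearly with both, and prompt caching, where available, would lower the effective price.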

And no registry fixes this. According to TrueFoundry's April 2026 analysis, the official MCP Registry has "no built-in curation, ratings, or governance features." Smithery offers no reliability evaluation. MCP Market provides "no assurance of quality or security." Over 10,000 MCP servers in the wild, 97 million monthly SDK downloads, and not a single marketplace scores whether a tool's description actually matches what the tool does.

Google, Anthropic, and OpenAI each shipped agent platforms that assume the tool layer is someone else's problem. Google will even optimize your system prompt for you — but the system prompt isn't where tool selection happens. The description field is. And right now, that field is a developer's 2 AM hackathon prose, copy-pasted across a thousand MCP forks, read by every model on every call, and reviewed by absolutely no one.

So before you upgrade your model, swap your provider, or connect your 51st integration — audit the descriptions you already ship. They are prompts you didn't know you were writing, and they control every tool decision your agent makes.
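A first-pass audit does not need a platform. The sketch below is one possible starting point: a few crude heuristics run over MCP-style tool definitions. The checks, the cue phrases, and the word-count threshold are all assumptions made for illustration; they are not the criteria used by the benchmark cited earlier, and they are no substitute for a human reading the text.

```python
from typing import Any

MIN_WORDS = 12  # arbitrary threshold, chosen for illustration only
USAGE_CUES = ("use when", "use this", "use for", "use only", "do not use")


def lint_tool(tool: dict[str, Any]) -> list[str]:
    """Return a list of heuristic findings for one tool definition."""
    issues = []
    description = (tool.get("description") or "").strip()

    # Very short descriptions rarely state both purpose and scope.
    if len(description.split()) < MIN_WORDS:
        issues.append("description too short to state purpose and scope")

    # Look for any phrase telling the model when to pick or avoid the tool.
    lowered = description.lower()
    if not any(cue in lowered for cue in USAGE_CUES):
        issues.append("no guidance on when to use or avoid the tool")

    # Every parameter should explain what it means, not just its type.
    properties = tool.get("inputSchema", {}).get("properties", {})
    for param, spec in properties.items():
        if not spec.get("description"):
            issues.append(f"parameter '{param}' has no description")

    return issues


if __name__ == "__main__":
    # Hypothetical tool definition, not taken from a real server.
    vague = {
        "name": "send_message",
        "description": "Sends a message.",
        "inputSchema": {
            "type": "object",
            "properties": {"target": {"type": "string"}},
        },
    }
    for finding in lint_tool(vague):
        print(f"{vague['name']}: {finding}")
```

Running something like this in CI next to your code linters would at least put description text through some review step, which is more than it gets today.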

The next differentiator in the agent tool ecosystem won't be who has the most integrations. It'll be who labels them properly. The first registry that enforces description quality becomes the npm-with-TypeScript of the agent world — and right now, that registry doesn't exist.
