
97% of MCP Tool Descriptions Are Broken

2026-04-10·4 min read

An academic analysis of 856 tools across 103 MCP servers found that 97.1% of tool descriptions contain at least one "smell" — unstated limitations, missing usage guidelines, opaque parameters. That's not a fringe problem. That's the baseline.

In The Six Levels of MCP Servers, I described what each maturity level looks like. Here's what the jump from Level 1 to Level 4 actually takes.

A tool description is not a sentence

The MCP specification's own best-practice proposal considers this a good description: "Read the contents of multiple files simultaneously. More efficient than reading files individually." That's the ceiling the community is aiming for.

After building 52 tools across seven production servers, I arrived at a different standard. A tool description is an operational manual — structured into blocks, each added because the agent failed without it.

The eight blocks

| Block | Purpose | Without it... |
| --- | --- | --- |
| RETURNS | Agent knows which fields come back | Can't determine if this tool has the data it needs |
| WHEN TO USE | Agent knows when this tool fits | Picks wrong tool or misses this one |
| WHEN NOT TO USE | Prevents wrong tool selection | Tries this tool for queries that belong elsewhere |
| QUERY STRATEGY | Teaches summary-first, then drill down | Fetches full records when a summary would suffice |
| INTERPRETATION | Cross-field rules, type-to-field mappings | Returns raw numbers without conclusions |
| RELATED TOOLS | Agent chains queries via join keys | Stops after first tool call |
| FEEDBACK | Agent reports friction | Issues go undetected, descriptions never improve |
| ALERTS | Agent surfaces server-generated warnings | Ignores domain-specific warnings in the response |

These blocks are not theoretical. Each was added in response to a specific failure mode observed in production. The agent picked the wrong tool — add WHEN NOT TO USE. The agent fetched 2,000 records instead of a summary — add QUERY STRATEGY. The agent ignored that a status code meant something entirely different for a different record type — add INTERPRETATION.
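To make the structure concrete, here is a minimal sketch of a description assembled from the eight blocks. The tool, field names, and join keys are hypothetical illustrations, not taken from any real server:

```typescript
// Hypothetical Level 4 tool description built from the eight blocks.
// Tool name, fields, and referenced tools are illustrative only.
const DESCRIPTION = `
Look up a building's energy profile by address.

RETURNS: energyLabel, buildingYear, usableAreaM2, buildingId (join key).
WHEN TO USE: questions about a single building's energy characteristics.
WHEN NOT TO USE: portfolio-wide aggregates; use the portfolio summary tool.
QUERY STRATEGY: request the summary view first; fetch the full record only
when a specific field is missing from the summary.
INTERPRETATION: energyLabel follows NTA 8800 (kWh/m2); do not compare it
with labels produced under other certification standards.
RELATED TOOLS: pass buildingId to the permit-history tool to chain queries.
FEEDBACK: report missing or ambiguous fields via the feedback tool.
ALERTS: surface any "warnings" array from the response to the user verbatim.
`.trim();

// Structural self-check: every block header must be present.
const BLOCKS = [
  "RETURNS", "WHEN TO USE", "WHEN NOT TO USE", "QUERY STRATEGY",
  "INTERPRETATION", "RELATED TOOLS", "FEEDBACK", "ALERTS",
];
for (const block of BLOCKS) {
  if (!DESCRIPTION.includes(block)) throw new Error(`missing block: ${block}`);
}
```

The self-check at the end is a habit worth keeping: when descriptions are assembled programmatically, a missing block fails at startup instead of silently degrading agent behavior.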

What actually works as a metadata channel

Not everything you write reaches the agent. After testing across Claude Desktop, Claude Code, and Cursor, a clear hierarchy emerged:

Tier 1 — Always works (~95% of the value). Tool descriptions and input/output schema .describe() annotations. The agent reads these on every call. This is where all domain knowledge must live.

Tier 2 — Works sometimes. Server instructions — cross-tool behavioral rules injected at session start. Some clients inject them, others don't. Useful for global rules, not a substitute for per-tool descriptions.

Tier 3 — Does not work in practice. MCP Resources and meta-tools for reference lookups. I built both, deployed them, tested them. Agents never requested them across any tested client. Both were removed.

The key insight: everything the agent needs must live in the tool description and input/output schema. That's the only reliable delivery channel — for now.
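As a sketch of what the Tier 1 channel carries: in a Zod-based server, `.describe()` annotations compile down to `description` fields in the JSON Schema the client receives. The fragment below shows that wire-level shape directly; the parameter names and postcode rule are illustrative assumptions, not the real API's:

```typescript
// Illustrative JSON Schema fragment for a tool's input, as an MCP client
// would see it. In a Zod-based server, .describe() annotations end up as
// these "description" fields. Parameter names are hypothetical.
const inputSchema = {
  type: "object",
  properties: {
    postcode: {
      type: "string",
      pattern: "^[1-9][0-9]{3}\\s?[A-Z]{2}$", // Dutch postcode, e.g. "1234 AB"
      description: "Dutch postcode. Uppercase letters; the space is optional.",
    },
    houseNumber: {
      type: "integer",
      minimum: 1,
      description: "House number without suffix.",
    },
  },
  required: ["postcode", "houseNumber"],
};

// Tier 1 in one sentence: if a constraint is not in the description or the
// schema, assume the agent never sees it.
const pattern = new RegExp(inputSchema.properties.postcode.pattern);
console.assert(pattern.test("1234 AB"));
console.assert(!pattern.test("0123 ab"));
```

Note that the regex does double duty: it rejects bad input server-side, and it documents the accepted format to the agent, which would otherwise have to guess.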

Before and after

Here's the difference in practice, using a building profile tool (same API, same data):

Level 1: "Look up building information by postcode and house number." Two untyped parameters. No field documentation. The agent guesses what an energy label means, whether the building year is reliable, and which postcode format the API accepts.

Level 4: Structured WHEN TO USE / WHEN NOT TO USE blocks. A QUERY STRATEGY that warns the agent not to trust smart-meter registration addresses as physical building addresses. An INTERPRETATION block that explains three different energy certification standards: NTA 8800 reports kWh/m2, for example, while Nader Voorschrift reports total building energy in MJ. Without that block, the agent compares the two as if they shared a unit and gives confidently wrong advice. An input schema with regex validation. Output fields with .describe() annotations documenting null patterns, cross-tool join keys, and which surface-area field to use for which benchmark.
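A hedged sketch of what those output-field annotations might look like, written here as a plain map of field name to annotation text. The field names, null semantics, and related tools are illustrative stand-ins, not the real API's:

```typescript
// Illustrative output-field documentation in the .describe() style.
// Every field name and semantic rule below is a hypothetical example.
const outputFieldDocs: Record<string, string> = {
  energyLabel:
    "A-G label under NTA 8800 (kWh/m2). Null means no registered " +
    "certificate, not label G. Do not compare with Nader Voorschrift " +
    "values, which report total building energy in MJ.",
  buildingYear:
    "Construction year from the cadastre. Very early years are often " +
    "administrative placeholders; treat them as low confidence.",
  buildingId:
    "Stable identifier; join key for the permit-history tool.",
  usableAreaM2:
    "Usable floor area. Use this field, not gross area, for energy benchmarks.",
};

// A useful test for each annotation: does it answer what null means,
// what the unit is, and which other tool the field joins against?
console.assert("buildingId" in outputFieldDocs);
```

Each entry encodes a judgment a domain expert would make automatically: what null really means, which unit applies, which field feeds which downstream comparison.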

Same tool. Different category. The Level 1 version exposes an API. The Level 4 version teaches the agent how a domain expert thinks about this data.

An open-source extract of this tool — with the metadata patterns intact — is available at github.com/DaveGold/mcp-building-profile-nl.

For the full five-phase pattern behind building these descriptions — including how AI discovers the domain knowledge that fills them — grab the white paper.