Your MCP Server Should Get Smarter Every Week

Writing good tool descriptions is the start. Keeping them good is the real work.

The description blocks I covered in a previous post are the initial knowledge layer. But how do you know what's missing? How do you find the gaps the agent silently works around without telling you? That's where the feedback architecture comes in.

After running seven MCP servers in production, I settled on three layers (weighted roughly 70/20/10 by value) with one design choice that ties everything together.

Layer 1: Tool call logs (70% of the value)

Log every tool call: request parameters, response summary, session ID, timestamp. No conversation text, no user questions, no agent reasoning. Just the MCP layer.

This is the foundation. Three consecutive calls with tightening filters on the same table mean the agent is struggling with something the metadata should have covered. You don't need the agent to tell you it's confused. The call pattern tells you.

In the production implementation, every tool call writes to a persistent store with: tool name, user, queryIntent, filter fields and operators (not values, for privacy), summaryOnly flag, row count, duration, and error type. Session IDs correlate multi-tool sequences.
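A minimal sketch of such a log record, assuming a JSON-lines sink. The class name, the snake_case field names, and the `redact_filters` helper are illustrative, not the production code; only the list of logged fields comes from the description above.

```python
import io
import json
import time
from dataclasses import dataclass, asdict, field
from typing import Optional


@dataclass
class ToolCallRecord:
    """One entry per tool call. No conversation text, only the MCP layer."""
    session_id: str
    tool: str
    user: str
    query_intent: str
    filters: list            # (field, operator) pairs only, never values
    summary_only: bool
    row_count: int
    duration_ms: float
    error_type: Optional[str] = None
    timestamp: float = field(default_factory=time.time)


def redact_filters(filters):
    """Keep field names and operators, drop the values for privacy."""
    return [(f["field"], f["op"]) for f in filters]


def log_tool_call(sink, record):
    """Append the record as one JSON line to any file-like sink."""
    sink.write(json.dumps(asdict(record)) + "\n")


# Usage: an in-memory sink stands in for the persistent store.
sink = io.StringIO()
log_tool_call(sink, ToolCallRecord(
    session_id="s-1",
    tool="get_service_records",
    user="u-1",
    query_intent="Only completed orders",
    filters=redact_filters([{"field": "type", "op": "eq", "value": 13}]),
    summary_only=False,
    row_count=4,
    duration_ms=120.0,
))
```

Redacting at write time, rather than at analysis time, means the filter values never reach the store in the first place.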

Layer 2: Pattern analysis (20% of the value)

Weekly analysis of the raw logs. Look for repeated call sequences, redundant calls, unused tools that should have been used, and tools called in an unexpected order.

Session IDs make this possible. You can trace an entire multi-tool interaction from start to finish and ask: where did the agent take a detour? Where did it call tool A when the answer was in tool B? Where did it retry with different filters because the first call didn't return what it expected?

This finds silent failures: cases where the agent got a wrong answer without realizing it.
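One such check can be sketched in a few lines: group log records by session and flag runs of consecutive calls to the same tool where each call adds filter fields to the previous one. The record shape (`session_id`, `tool`, `timestamp`, `filter_fields`) and the three-call threshold are assumptions for illustration.

```python
from collections import defaultdict


def find_tightening_runs(records, min_length=3):
    """Return (session_id, tool, run_length) for each run of consecutive
    calls to one tool where every call's filter fields strictly extend
    the previous call's filter fields."""
    by_session = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        by_session[rec["session_id"]].append(rec)

    findings = []
    for session_id, calls in by_session.items():
        run = 1
        for prev, cur in zip(calls, calls[1:]):
            same_tool = prev["tool"] == cur["tool"]
            tighter = set(prev["filter_fields"]) < set(cur["filter_fields"])
            if same_tool and tighter:
                run += 1
            else:
                # The run ended at `prev`; report it if it was long enough.
                if run >= min_length:
                    findings.append((session_id, prev["tool"], run))
                run = 1
        if run >= min_length:
            findings.append((session_id, calls[-1]["tool"], run))
    return findings


# Usage with hypothetical records: one struggling session, one clean one.
records = [
    {"session_id": "s-1", "tool": "get_service_records",
     "timestamp": "09:14:03", "filter_fields": ["project"]},
    {"session_id": "s-1", "tool": "get_service_records",
     "timestamp": "09:14:07", "filter_fields": ["project", "type"]},
    {"session_id": "s-1", "tool": "get_service_records",
     "timestamp": "09:14:09", "filter_fields": ["project", "type", "completed"]},
    {"session_id": "s-2", "tool": "get_service_records",
     "timestamp": "09:20:00", "filter_fields": ["project"]},
]
flagged = find_tightening_runs(records)
```

Each flagged run is a candidate for a metadata fix, not proof of one. The weekly review still has to decide whether the agent was confused or simply drilling down.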

Layer 3: queryIntent + report tool (10% of the value)

A queryIntent parameter on every tool call: one sentence describing the business question being answered. Plus a report tool the agent can use when genuinely stuck.

Here's where it becomes concrete. The same three-call sequence, with and without queryIntent:

Without:

09:14:03 | get_service_records | summaryOnly=true, project=P-7056

09:14:07 | get_service_records | type=13, project=P-7056

09:14:09 | get_service_records | type=13, project=P-7056, completed=true

Three calls. The last one added a filter. Why? Unknown.

With:

09:14:03 | intent: Overview of all service records for project P-7056

09:14:07 | intent: Which maintenance orders exist for this project

09:14:09 | intent: Only completed orders, previous call also returned open items

Now you see exactly what happened: the agent expected only completed items, but the metadata didn't make clear that a completion filter is needed. That's a targeted metadata fix that takes minutes to implement.
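One way to put queryIntent on every tool is to add it to each tool's input schema in a single place. The sketch below assumes tools declare their inputs as JSON Schema dictionaries; the `with_query_intent` helper and the example schema are hypothetical.

```python
import copy

QUERY_INTENT_PROPERTY = {
    "type": "string",
    "description": "One sentence describing the business question "
                   "this call answers.",
}


def with_query_intent(input_schema):
    """Return a copy of a tool's JSON Schema with queryIntent required."""
    schema = copy.deepcopy(input_schema)
    schema.setdefault("properties", {})["queryIntent"] = dict(QUERY_INTENT_PROPERTY)
    required = schema.setdefault("required", [])
    if "queryIntent" not in required:
        required.append("queryIntent")
    return schema


# Usage: a hypothetical schema for get_service_records.
base_schema = {
    "type": "object",
    "properties": {
        "project": {"type": "string"},
        "summaryOnly": {"type": "boolean"},
    },
    "required": ["project"],
}
schema = with_query_intent(base_schema)
```

Marking the parameter as required, instead of optional, keeps the log from filling up with calls that carry no intent at all.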

Nudge patterns: making feedback happen

Agents don't self-report reliably. Academic research shows mixed results on LLM metacognition. So instead of relying on the agent's initiative, the production implementation injects contextual nudges into tool responses:

Empty results: "No records matched your filters. If unexpected, consider calling report_problem."
Pagination detected: "You're paginating (skip=200). If you're doing this to manually aggregate data, call report_problem. We may be able to add that summary dimension."
Every response: a feedback reminder appended as the last alert, "If this query took multiple attempts or returned confusing data, call report_problem before your next step."

These nudges are injected by the server's handler factory, not by the agent's judgment. They target specific friction patterns and give the agent a low-effort path to report issues it would otherwise silently work around.
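A handler factory along these lines could look like the sketch below. The `make_handler` name, the `skip` argument, and the `{"rows", "alerts"}` response shape are assumptions for illustration; the nudge texts are the ones listed above.

```python
EMPTY_NUDGE = ("No records matched your filters. "
               "If unexpected, consider calling report_problem.")
REMINDER = ("If this query took multiple attempts or returned confusing "
            "data, call report_problem before your next step.")


def make_handler(query_fn):
    """Wrap a tool's query function so the server, not the agent,
    decides when a nudge is attached to the response."""
    def handler(arguments):
        rows = query_fn(arguments)
        alerts = []
        if not rows:
            alerts.append(EMPTY_NUDGE)
        skip = arguments.get("skip", 0)
        if skip > 0:
            alerts.append(
                f"You're paginating (skip={skip}). If you're doing this to "
                "manually aggregate data, call report_problem. We may be "
                "able to add that summary dimension."
            )
        # The general reminder always goes last.
        alerts.append(REMINDER)
        return {"rows": rows, "alerts": alerts}
    return handler


# Usage: stub query functions stand in for real data access.
empty_page = make_handler(lambda args: [])({"skip": 200})
normal = make_handler(lambda args: [{"id": 1}])({})
```

Because every tool goes through the same factory, a new friction pattern needs one new condition here and no changes to the individual tools.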

Why this compounds

Each resolved issue makes the system permanently smarter. A metadata update isn't a patch. It eliminates an entire category of uncertainty for every future session, every connected agent. The metadata layer grows richer over time while the reasoning layer stays the same. Cheap, permanent, compounding progress.

Go deeper: read the full practitioner report, The Missing Layer, or explore the working code in the mcp-metadata-demo server.