Agentic MCP Server vs Reality
Every agentic platform loves to slap an “MCP-supported” badge on its marketing site, yet the moment you switch from a tidy demo to a real production workflow, you discover just how brittle today’s MCP landscape is:
- Model-agnostic ≠ model-aware. A JSON tool schema may be standard, but each model interprets it differently; without per-model tuning and evaluation, quality degrades fast.
- Context windows aren’t respected. One misplaced call in a long chain can exhaust Anthropic’s context window and crash the run.
- Silent truncation. GPT-4o quietly chops tool descriptions after ~1,024 chars—often mid-sentence—leaving the agent half-informed.
- Reality check. In our own benchmarks, even handcrafted agents pick the wrong tool roughly 50% of the time—hardly “personal-assistant” grade.
- Expectation inflation. Models will improve, but user expectations rise even faster. “Divinely discontent” is the default state.
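The truncation issue above is at least cheap to defend against. A minimal sketch of a pre-registration lint check, assuming the ~1,024-char cutoff observed above (the `Tool` shape and `lintDescription` helper here are hypothetical simplifications, not part of the MCP spec):

```typescript
// Guard against silent truncation: flag any tool description that
// exceeds the observed ~1,024-char limit before it reaches the model.
const MAX_DESCRIPTION_CHARS = 1024; // observed cutoff (assumption)

interface Tool {
  name: string;
  description: string;
}

function lintDescription(tool: Tool): string[] {
  const issues: string[] = [];
  if (tool.description.length > MAX_DESCRIPTION_CHARS) {
    issues.push(
      `"${tool.name}": description is ${tool.description.length} chars; ` +
        `anything past ${MAX_DESCRIPTION_CHARS} may be silently dropped`
    );
  }
  return issues;
}
```

Running this over a tool registry at startup turns a silent quality leak into a loud build-time warning.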
How It Works
- The optimizer generates 5 test scenarios for your tool
- It evaluates the current description against these scenarios using the selected AI model
- It collects feedback from failed evaluations
- It uses the AI model to create an improved description addressing the feedback
- It repeats the process until either:
  - the description passes 4/5 test scenarios, or
  - the maximum number of improvement iterations (3) is reached
The result is a clearer, more accurate, and more efficient tool description that helps AI models better understand how to use your tool.
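The loop above can be sketched in a few lines of TypeScript. The `generateScenarios`, `evaluate`, and `improve` helpers are hypothetical stand-ins for the actual model calls; they are injected as an interface so the loop itself stays testable:

```typescript
// Sketch of the optimization loop described above. The Model interface
// is a placeholder for real LLM calls, not the project's actual API.
interface Scenario { prompt: string }
interface EvalResult { passed: boolean; feedback: string }

interface Model {
  generateScenarios(tool: string, n: number): Scenario[];
  evaluate(description: string, scenario: Scenario): EvalResult;
  improve(description: string, feedback: string[]): string;
}

const SCENARIO_COUNT = 5;  // test scenarios generated per tool
const PASS_THRESHOLD = 4;  // description must pass 4/5 scenarios
const MAX_ITERATIONS = 3;  // cap on improvement attempts

function optimizeDescription(tool: string, description: string, model: Model): string {
  const scenarios = model.generateScenarios(tool, SCENARIO_COUNT);
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const results = scenarios.map((s) => model.evaluate(description, s));
    if (results.filter((r) => r.passed).length >= PASS_THRESHOLD) break;
    // Collect feedback only from failed evaluations and rewrite.
    const feedback = results.filter((r) => !r.passed).map((r) => r.feedback);
    description = model.improve(description, feedback);
  }
  return description;
}
```

Because the exit condition is checked before each improvement call, a description that already passes 4/5 scenarios is returned untouched.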
What’s next for MCP Tool Orchestrator
**Online tool definition registry**: users can share their already-optimized tool descriptions in a common registry, and other users can reuse those tools with their own models.
Ready to try it? Clone the repo!
Built With
- evals
- llm
- typescript