Inspiration

Every team building on LLMs reinvents the same four safety primitives: argument validation, an egress firewall, JSON repair, and context fitting. The Model Context Protocol (MCP) gives us the first credible standard for sharing tools across agents, so I built one MCP server that ships all of them.

What it does

agent-safety-mcp is an MCP server that gives any LLM agent (Claude, Gemini, GPT-4, anything MCP-aware) a built-in safety layer. Six tools:

  • validate_args — catch hallucinated tool arguments before they reach the tool, with a structured retry hint
  • check_egress — declarative URL allowlist enforced at MCP-call time to stop data leaks
  • extract_json — pull JSON out of fenced or chatty model output
  • fit_messages + count_tokens — keep chat history under the model's context budget
  • diff_snapshot — regression-test agent traces across runs
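
To make two of these tools concrete, here is a minimal sketch of what an egress allowlist check and a JSON extractor could look like. The function names, signatures, and matching rules (checkEgress, extractJson, subdomain matching) are assumptions made for illustration; they are not the server's or the underlying libraries' actual API.

```typescript
// Hypothetical sketch: names and behavior are illustrative assumptions,
// not the real check_egress / extract_json implementations.

/** Allows a URL only if its host equals an allowlisted host or is a subdomain of one. */
function checkEgress(url: string, allowedHosts: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false; // unparseable URLs are denied by default
  }
  return allowedHosts.some((allowed) => {
    const a = allowed.toLowerCase();
    return host === a || host.endsWith("." + a);
  });
}

// Built with repeat() so this listing never contains a literal code fence.
const FENCE = "`".repeat(3);

/** Returns the first parseable JSON object or array found in model output, else undefined. */
function extractJson(text: string): unknown {
  // Prefer the body of a fenced block (optionally tagged "json") when present.
  const fenced = text.match(new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE));
  const candidate = fenced?.[1] ?? text;
  for (let start = 0; start < candidate.length; start++) {
    const ch = candidate[start];
    if (ch !== "{" && ch !== "[") continue;
    const close = ch === "{" ? "}" : "]";
    // Try the widest span first, then shrink to earlier closing brackets.
    for (
      let end = candidate.lastIndexOf(close);
      end > start;
      end = candidate.lastIndexOf(close, end - 1)
    ) {
      try {
        return JSON.parse(candidate.slice(start, end + 1));
      } catch {
        // Not valid JSON for this span; keep shrinking.
      }
    }
  }
  return undefined;
}
```

In this sketch the egress check denies by default, so a malformed URL or a lookalike host such as example.com.evil.io is rejected rather than waved through.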

The tools wrap five separate, MIT-licensed npm libraries (the @mukundakatta agent-stack), all built and shipped before this hackathon but never composed into one MCP surface until now.
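
As a rough illustration of the context-fitting idea behind fit_messages and count_tokens, the sketch below trims chat history to a token budget. Everything in it is an assumption for illustration: the ChatMessage shape, the function names, the keep-newest policy, and especially the four-characters-per-token estimate, which stands in for a real tokenizer.

```typescript
// Hypothetical sketch: shapes, names, and the trimming policy are assumptions,
// not the actual fit_messages / count_tokens behavior.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

/** Crude estimate of about 4 characters per token; a real tokenizer will differ. */
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

/**
 * Keeps every system message, then as many of the most recent
 * non-system messages as fit within the budget.
 * System messages are moved to the front of the result.
 */
function fitMessages(messages: ChatMessage[], budget: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  let used = system.reduce((sum, m) => sum + countTokens(m.content), 0);
  const kept: ChatMessage[] = [];
  // Walk from newest to oldest so recent turns win.
  for (const m of [...messages].reverse()) {
    if (m.role === "system") continue;
    const cost = countTokens(m.content);
    if (used + cost > budget) break; // stop at the first turn that no longer fits
    used += cost;
    kept.unshift(m); // restore chronological order
  }
  return [...system, ...kept];
}
```

Dropping the oldest turns first is only one possible policy; summarizing older turns or pinning specific messages would be reasonable alternatives.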

How I built it

  • TypeScript + Node 22
  • @modelcontextprotocol/sdk v1.29
  • Five published npm libs as zero-dep primitives
  • Stdio + Streamable HTTP transports
  • Dockerized for Google Cloud Run
  • Live demo: Gemini 2.0 Flash on Vertex AI calls the MCP server via @google/genai's mcpToTool helper

Challenges I ran into

The first Cloud Run deploy on a fresh project failed with PERMISSION_DENIED on Cloud Build's source bucket. Fix: grant the roles/cloudbuild.builds.builder and roles/logging.logWriter roles to the project's default Compute service account. The second deploy succeeded.

Accomplishments that I'm proud of

End-to-end working stack — agent calls MCP, MCP enforces safety — deployed to a public Cloud Run endpoint that any MCP client can register in one line of config. All six tools verified live via curl + the MCP TS client SDK.

What I learned

MCP's Streamable HTTP transport finally makes "shared agent tooling on a public URL" a real pattern instead of a slideware proposal. Composing existing npm libs into MCP tools is dramatically faster than building agent-side integrations.

What's next

  • Persistent allowlists per session (policies survive across MCP requests)
  • A second tool layer wrapping the Python siblings (driftvane, bedrock-kit, cachebench)
  • An OpenTelemetry exporter for diff_snapshot results so drift shows up in dashboards
