Inspiration

MCP (Model Context Protocol) is the emerging standard for how AI agents connect to tools. Composio has hundreds of MCP servers, and internal engineering teams are increasingly wrapping their APIs as MCP servers. But security tooling for this surface barely exists yet: traditional appsec scanners weren't designed for MCP-specific vulnerability classes like prompt injection, credential leakage in tool responses, or scope violations between read and write operations. We wanted to build the auditor for this layer before someone gets badly burned.

What it does

MCP Auditor is an autonomous multi-agent system that performs security audits of MCP servers. Given a target MCP server URL, the system:

  • Enumerates the server's exposed tools via the MCP protocol
  • Fans out six specialized prober agents in parallel, one per SAFE-T vulnerability class (credential leakage, prompt injection, unauthorized data access, scope violations, side effects, auth bypass)
  • Governs every probe through Guild AI's permissioning layer: each probe is allowed or denied and logged to a tamper-evident audit trail
  • Synthesizes findings into a cited security audit report (cited.md) where every claim is grounded in the SAFE-T taxonomy and the specific probe evidence
  • Files the report as a GitHub issue via Composio
  • Monetizes via x402: the orchestrator returns HTTP 402 Payment Required and only runs the audit once payment proof is verified
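The payment-gating step can be illustrated with a minimal sketch. This is not the project's actual code and not the real x402 wire format: the request and response shapes, the `paymentProof` field, and the mock verifier and runner are all hypothetical, chosen only to show the "402 until proof is verified" control flow.

```typescript
// Hypothetical request/response shapes; not the real x402 wire format.
type AuditRequest = { targetUrl: string; paymentProof?: string };
type AuditResponse = { status: 200 | 402; body: string };

// Injected dependencies so the gate itself stays a pure function.
type ProofVerifier = (proof: string) => boolean;
type AuditRunner = (targetUrl: string) => string;

function handleAuditRequest(
  req: AuditRequest,
  verifyProof: ProofVerifier,
  runAudit: AuditRunner,
): AuditResponse {
  // No proof, or a proof that fails verification: ask for payment, do no work.
  if (req.paymentProof === undefined || !verifyProof(req.paymentProof)) {
    return { status: 402, body: "Payment Required" };
  }
  // Only a verified payment reaches the audit pipeline.
  return { status: 200, body: runAudit(req.targetUrl) };
}

// Mock verifier and runner, for illustration only.
let auditsRun = 0;
const mockVerify: ProofVerifier = (proof) => proof === "valid-proof";
const mockAudit: AuditRunner = (url) => {
  auditsRun += 1;
  return `audit report for ${url}`;
};

const unpaid = handleAuditRequest(
  { targetUrl: "https://mcp.example/sse" },
  mockVerify,
  mockAudit,
);
const paid = handleAuditRequest(
  { targetUrl: "https://mcp.example/sse", paymentProof: "valid-proof" },
  mockVerify,
  mockAudit,
);
console.log(unpaid.status, paid.status, auditsRun); // 402 200 1
```

Keeping the verifier injectable is one way a mock verification path and a real settlement path could share the same gate.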

Every component is observable. Every action is governed. Every claim is cited.

How we built it

Architecture: Multi-agent orchestration with an orchestrator routing across six parallel prober agents and a synthesizer agent. All probes route through a single governance executor that enforces permissioning policies before any tool call.

Stack:

  • AWS Bedrock (Claude) for probe generation and finding synthesis
  • Guild AI as the governance and audit layer, with every probe gated and logged
  • Composio for the live external action of filing audit reports as GitHub issues
  • x402 for HTTP-native payment rails on the audit endpoint
  • ClickHouse Cloud for findings storage and analytics
  • Render for hosting the frontend dashboard and audit backend service
  • Senso for publishing the cited audit report as a context source for downstream agents
  • TypeScript + Bun for the agent runtime, Next.js for the frontend
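The single-executor idea from the architecture can be sketched as follows. This is an illustrative sketch only: it does not use Guild AI's API, the `Probe` and `LogEntry` shapes and the read-only policy are assumptions, and a toy FNV-1a hash stands in for the cryptographic hash a real tamper-evident log would need.

```typescript
// Hypothetical shapes for illustration; Guild AI's real API is not shown here.
type Probe = { prober: string; tool: string; action: "read" | "write" };
type LogEntry = { seq: number; probe: Probe; allowed: boolean; prevHash: string; hash: string };

// Toy FNV-1a hash standing in for a cryptographic hash such as SHA-256.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0).toString(16);
}

function entryHash(seq: number, probe: Probe, allowed: boolean, prevHash: string): string {
  return fnv1a(JSON.stringify([seq, probe, allowed, prevHash]));
}

class GovernanceExecutor {
  private readonly log: LogEntry[] = [];
  private readonly policy: (probe: Probe) => boolean;

  constructor(policy: (probe: Probe) => boolean) {
    this.policy = policy;
  }

  // Every probe goes through here: decide, log, then run only if allowed.
  execute<T>(probe: Probe, run: () => T): T | undefined {
    const allowed = this.policy(probe);
    const seq = this.log.length;
    const prevHash = seq === 0 ? "genesis" : this.log[seq - 1].hash;
    this.log.push({ seq, probe, allowed, prevHash, hash: entryHash(seq, probe, allowed, prevHash) });
    return allowed ? run() : undefined;
  }

  // Shallow copies, so callers cannot edit the executor's own entries.
  entries(): LogEntry[] {
    return this.log.map((e) => ({ ...e }));
  }
}

// Recompute the chain; an edited entry breaks its own hash or the next link.
function verifyChain(entries: LogEntry[]): boolean {
  return entries.every((e, i) => {
    const prevHash = i === 0 ? "genesis" : entries[i - 1].hash;
    return e.prevHash === prevHash && e.hash === entryHash(e.seq, e.probe, e.allowed, prevHash);
  });
}

// Example policy (an assumption): probers may only issue read-style calls.
const executor = new GovernanceExecutor((p) => p.action === "read");
const readResult = executor.execute(
  { prober: "prober-credential-leakage", tool: "get_config", action: "read" },
  () => "response",
);
const writeResult = executor.execute(
  { prober: "prober-excessive-scope", tool: "run_query", action: "write" },
  () => "response",
);
console.log(readResult, writeResult, verifyChain(executor.entries()));
```

Here the read probe runs and returns its response, the write probe is denied and returns `undefined`, and both decisions land in the log. Because each entry's hash covers the previous entry's hash, changing any recorded decision after the fact makes `verifyChain` return `false`.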

Critical-path discipline: We scoped a hard "anti-fake-done gate" — bun run demo:local must exit zero with all artifacts (findings, audit trail, cited.md) present and non-empty. Stretch integrations (ClickHouse dashboards, Render deployment, real on-chain settlement) were explicitly cut-able if the critical path slipped.
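A gate of this kind can be sketched in a few lines. The following is a hedged illustration, not the project's demo script: apart from cited.md, the artifact file names are hypothetical, and the file-size lookup is injected (backed here by an in-memory stand-in) so the sketch stays self-contained. A real script would back it with the filesystem.

```typescript
// Artifact names other than cited.md are hypothetical placeholders.
const REQUIRED_ARTIFACTS = ["findings.json", "audit-trail.jsonl", "cited.md"];

// Returns the size in bytes, or null when the file does not exist.
type SizeLookup = (path: string) => number | null;

function gateFailures(artifacts: string[], sizeOf: SizeLookup): string[] {
  const failures: string[] = [];
  for (const path of artifacts) {
    const size = sizeOf(path);
    if (size === null) failures.push(`${path}: missing`);
    else if (size === 0) failures.push(`${path}: empty`);
  }
  return failures;
}

// Exit code for the demo script: zero only when every artifact passes.
function gateExitCode(artifacts: string[], sizeOf: SizeLookup): number {
  return gateFailures(artifacts, sizeOf).length === 0 ? 0 : 1;
}

// In-memory stand-in for the filesystem, for illustration only.
const fakeDisk: Record<string, number> = { "findings.json": 2048, "cited.md": 0 };
const lookup: SizeLookup = (path) => (path in fakeDisk ? fakeDisk[path] : null);

// Two failures: audit-trail.jsonl is missing and cited.md is empty.
console.log(gateFailures(REQUIRED_ARTIFACTS, lookup));
console.log(gateExitCode(REQUIRED_ARTIFACTS, lookup)); // 1
```

Checking for non-empty files, and not just for a zero exit from the pipeline, is what stops a run that silently produced nothing from counting as done.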

ClickHouse Cloud integration (live): Findings from every audit land in a ClickHouse Cloud instance (us-central1) in the mcp_findings table. After a clean demo run the table contains 5 findings across 5 SAFE-T classes:

severity  safeT       tool               prober
critical  SAFE-T1502  get_config         prober-credential-leakage
high      SAFE-T1102  search_docs        prober-description-poisoning
high      SAFE-T1104  run_query          prober-excessive-scope
high      SAFE-T1106  read_file          prober-path-traversal
high      SAFE-T1402  send_notification  prober-unvalidated-outbound

The store is fail-soft — if ClickHouse is unreachable, the dashboard falls back to in-memory rollups and the critical path is undisturbed.
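The fail-soft behavior can be sketched as a store that always keeps an in-memory copy and treats the remote insert as best-effort. This is a sketch under assumptions, not the project's storage code: `FindingsSink` is a hypothetical interface standing in for a ClickHouse client wrapper, and the sample rows are taken from the table above.

```typescript
type Finding = { severity: string; safeT: string; tool: string; prober: string };

// Hypothetical sink interface; a real one would wrap the ClickHouse client.
interface FindingsSink {
  insert(findings: Finding[]): Promise<void>;
}

class FailSoftStore {
  private readonly memory: Finding[] = [];
  private readonly primary: FindingsSink;
  degraded = false;

  constructor(primary: FindingsSink) {
    this.primary = primary;
  }

  async save(findings: Finding[]): Promise<void> {
    // Always keep a local copy so rollups work whether or not the insert lands.
    this.memory.push(...findings);
    try {
      await this.primary.insert(findings);
    } catch {
      // Swallow the error: remote storage is not on the critical path.
      this.degraded = true;
    }
  }

  // In-memory rollup used when the remote store cannot be queried.
  countsBySeverity(): Record<string, number> {
    const counts: Record<string, number> = {};
    for (const f of this.memory) counts[f.severity] = (counts[f.severity] ?? 0) + 1;
    return counts;
  }
}

// Simulate an unreachable database.
const unreachable: FindingsSink = {
  insert: async () => {
    throw new Error("connection refused");
  },
};

const store = new FailSoftStore(unreachable);
const rollup = store
  .save([
    { severity: "critical", safeT: "SAFE-T1502", tool: "get_config", prober: "prober-credential-leakage" },
    { severity: "high", safeT: "SAFE-T1102", tool: "search_docs", prober: "prober-description-poisoning" },
  ])
  .then(() => store.countsBySeverity());

rollup.then((counts) => console.log(counts, "degraded:", store.degraded));
```

Even though the insert throws, `save` resolves normally, the rollup reports one critical and one high finding, and `degraded` is set so a dashboard could show that it is serving in-memory data.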

Challenges we ran into

  • Multi-agent coordination under a time box. Six parallel probers with a shared governance executor and a synthesizer downstream means race conditions, audit log integrity, and deterministic ordering all matter. We designed the runner pattern to write through a single executor and verified determinism with three consecutive demo runs producing identical findings.
  • MCP is a new attack surface with no standard tooling. We had to define what "good probes" look like for each SAFE-T class from scratch, using the taxonomy as scaffolding but writing the actual probe payloads ourselves.
  • Sponsor integration without bolting on logos. Every sponsor in our stack does a load-bearing job. Guild governs every action. Composio files the live report. x402 gates the audit endpoint. ClickHouse stores findings analytically. Senso publishes the cited report. We resisted the urge to add sponsors that didn't have a natural role.
  • Honesty about partial integrations. Several integrations have a "demo mode" and a "live mode" with degraded paths: Composio dry-run vs. live filing, x402 mock verification vs. real on-chain settlement, and Senso offline vs. published. We documented every degraded path explicitly rather than overclaiming.
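The deterministic-ordering concern can be illustrated with a small sketch. It is not the project's runner: the `Prober` shape and the mock probers with random latency are assumptions, and the findings are sample rows from the ClickHouse table. The point is that results are collected in a fixed order and then sorted on a stable key, so output does not depend on which prober finishes first.

```typescript
type Finding = { safeT: string; tool: string; severity: string };
type Prober = { name: string; run: () => Promise<Finding[]> };

const sortKey = (f: Finding): string => `${f.safeT}|${f.tool}`;

async function fanOut(probers: Prober[]): Promise<Finding[]> {
  // Promise.all resolves to results in input order, not completion order.
  const perProber = await Promise.all(probers.map((p) => p.run()));
  const all = perProber.reduce<Finding[]>((acc, fs) => acc.concat(fs), []);
  // Sort on a stable key so output is also independent of prober listing order.
  return all.sort((a, b) => (sortKey(a) < sortKey(b) ? -1 : sortKey(a) > sortKey(b) ? 1 : 0));
}

// Mock prober with random latency, so completion order varies between runs.
function mockProber(name: string, finding: Finding): Prober {
  return {
    name,
    run: () =>
      new Promise<Finding[]>((resolve) => {
        setTimeout(() => resolve([finding]), Math.random() * 20);
      }),
  };
}

const probers: Prober[] = [
  mockProber("prober-credential-leakage", { safeT: "SAFE-T1502", tool: "get_config", severity: "critical" }),
  mockProber("prober-description-poisoning", { safeT: "SAFE-T1102", tool: "search_docs", severity: "high" }),
  mockProber("prober-excessive-scope", { safeT: "SAFE-T1104", tool: "run_query", severity: "high" }),
];

// Three consecutive runs, serialized for comparison.
async function threeRuns(): Promise<string[]> {
  const outputs: string[] = [];
  for (let i = 0; i < 3; i++) {
    outputs.push(JSON.stringify(await fanOut(probers)));
  }
  return outputs;
}

const outputs = threeRuns();
outputs.then((o) => console.log(o[0] === o[1] && o[1] === o[2])); // true
```

Comparing serialized output across runs mirrors the three-run determinism check described above, applied here to mock data.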

Accomplishments that we're proud of

  • Multi-agent fan-out actually runs in parallel under a single governance executor with a clean audit trail. Six probers, zero race conditions, deterministic across runs.
  • Every finding is grounded. Our cited.md isn't decoration: every claim cites a SAFE-T entry, a specific probe, and the response evidence that triggered the classification.
  • The build is structurally insulated. Stretch integrations like ClickHouse and Senso are fail-soft: they can be unconnected and the core demo path still exits zero. This let us add sponsors aggressively without destabilizing the critical path.
  • End-to-end on Render. Frontend and backend deployed and verified; full audit runs on the public deploy with identical output to local.
  • A real payment gate. x402 returns proper 402 responses, accepts payment proofs, and runs the audit only on verified payment.

What we learned

Agent infrastructure security is a real and growing gap, and the people building agent platforms know it. The SAFE-T taxonomy is a strong starting point but the practical work — writing probes that actually find real flaws — is still emerging. Multi-agent systems are far easier to coordinate when there's a single governance bottleneck that every action passes through. And the hardest design discipline in a one-day build is saying no to stretch integrations that would destabilize a working critical path.

What's next for MCP Server Auditor

  • Real on-chain x402 settlement with a CDP-managed wallet for production payments
  • Adaptive severity classification via Pioneer's adaptive inference: the model improves as security engineers thumbs-up/down findings
  • Expanded SAFE-T coverage beyond the initial six classes
  • Continuous monitoring mode: webhook-triggered scans on every MCP server deployment
  • A public registry of audited MCP servers with severity scores

Built With

  • anthropic-claude
  • aws-bedrock
  • bun
  • clickhouse
  • composio
  • github-actions
  • guild-ai
  • langgraph
  • mcp
  • multi-agent
  • nextjs
  • penetration-testing
  • render
  • security
  • senso
  • typescript
  • x402