## Inspiration
AI agent frameworks like LangChain, CrewAI, and OpenAI function calling are exploding in adoption — but the boundary between agents and their tools remains largely unsecured. Real attack vectors exist today:
- A web scrape returns hidden text that hijacks the agent's behavior
- An agent gets tricked into accessing 169.254.169.254 — the AWS metadata endpoint that leaks IAM credentials
- A research-only agent gets manipulated into executing shell commands
We kept seeing these vulnerabilities discussed in security research but no practical, framework-agnostic tool existed to defend against them. So we built one.
## What it does
ClawGuard is a real-time security gateway that sits between AI agents and their tools. Instead of calling a tool directly, agents route requests through ClawGuard's proxy, which runs three security layers before the response reaches the agent:
- Policy Enforcement — YAML-defined per-agent tool permissions with glob patterns and CIDR-based network deny lists that block SSRF before the request ever leaves
- Prompt Injection Detection — 10 weighted regex patterns with cumulative scoring, plus an optional Groq LLM classifier for ambiguous cases. Includes Unicode zero-width character evasion detection and NFKC normalization
- Real-Time Audit Trail — Every interaction is logged to SQLite, risk-scored, and streamed to a live dashboard via WebSocket
It's completely framework-agnostic. Any AI agent that makes HTTP calls can route through it with a single line change.
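As a sketch of what that one-line change can look like in practice — the gateway address and query-parameter shape below are assumptions for illustration, not ClawGuard's documented proxy interface:

```python
from urllib.parse import quote

# Assumed gateway address and query shape -- illustrative only,
# not ClawGuard's actual API.
CLAWGUARD_URL = "http://localhost:8000/proxy"

def proxied(tool_url: str) -> str:
    """Rewrite a direct tool URL so the request is routed through
    the security gateway instead of hitting the tool directly."""
    return f"{CLAWGUARD_URL}?target={quote(tool_url, safe='')}"
```

The agent's HTTP layer then calls `proxied(url)` wherever it previously used `url`; everything else about the agent stays unchanged.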
## How we built it
Backend: Python with FastAPI, fully async. The proxy pipeline flows through policy check → httpx forwarding → response scanning → return/block. Detection uses a weighted scoring system where each regex pattern has a severity weight (0.3–0.95), and a cumulative score determines the risk level. For medium-risk ambiguous cases, an optional Groq LLM classifier (llama-3.3-70b) provides semantic analysis with a 3-second timeout.
Frontend: Next.js with TypeScript and Tailwind CSS. The dashboard connects via WebSocket for real-time event streaming — you can watch attacks get blocked as they happen. Recharts for threat timeline visualization.
SDK: A pip-installable Python package (clawguard) with an async client, a @protect decorator that scans function return values, and ready-made wrappers for LangChain and CrewAI.
Testing: 99 tests total (87 backend + 12 SDK) covering pattern detection, weighted scoring thresholds, policy evaluation, end-to-end proxy integration, and all three demo attack scenarios.
## Challenges we faced
Weighted scoring design — Binary match/no-match wasn't good enough. A single base64 string shouldn't block a response, but base64 combined with "ignore previous instructions" should. We landed on cumulative weighted scoring with per-pattern deduplication — each pattern contributes its weight once, avoiding inflation from repeated matches.
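The cumulative scoring with per-pattern deduplication can be sketched roughly like this — the patterns and weights below are illustrative, not ClawGuard's real rule set:

```python
import re

# Illustrative subset of weighted patterns (ClawGuard ships 10).
PATTERNS = [
    (re.compile(r"ignore (all )?previous instructions", re.I), 0.9),
    (re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"), 0.3),  # long base64-ish blob
    (re.compile(r"you are now", re.I), 0.5),
]

def risk_score(text: str) -> float:
    # Each pattern contributes its weight at most once, so repeated
    # matches of the same pattern cannot inflate the score.
    score = sum(weight for pattern, weight in PATTERNS if pattern.search(text))
    return min(score, 1.0)
```

A lone base64 blob scores 0.3 (below a blocking threshold), while the blob plus an "ignore previous instructions" phrase pushes the cumulative score high enough to block.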
Unicode evasion — Attackers can insert zero-width characters between letters to bypass regex. "ignore" with a zero-width space hidden mid-word looks identical to a human but never matches the pattern. We solved this with NFKC normalization and dual-pass scanning — patterns run against both the original text (to detect the evasion itself) and the normalized text (to catch the hidden payload).
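A minimal sketch of the dual-pass idea, under the assumption that zero-width characters are stripped explicitly (NFKC alone leaves characters like U+200B in place); ClawGuard's actual scanner may handle this differently:

```python
import re
import unicodedata

ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\ufeff]")
PAYLOAD = re.compile(r"ignore previous instructions", re.I)

def dual_pass_scan(text: str) -> dict:
    # Pass 1: raw text -- the mere presence of zero-width characters
    # is itself an evasion signal.
    evasion = bool(ZERO_WIDTH.search(text))
    # Pass 2: normalize, then strip zero-width characters so the
    # hidden payload matches the ordinary pattern.
    cleaned = ZERO_WIDTH.sub("", unicodedata.normalize("NFKC", text))
    return {"evasion_chars": evasion, "payload": bool(PAYLOAD.search(cleaned))}
```

Running both passes means an evasion attempt raises two signals: one for the obfuscation technique, one for the payload it was hiding.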
Async DNS — The initial policy engine used blocking socket.gethostbyname() for SSRF checks, which stalled the entire event loop under load. Replaced with the event loop's non-blocking getaddrinfo() to keep the proxy responsive.
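A rough sketch of a non-blocking SSRF check along those lines — the helper name and the built-in range checks are illustrative; the real policy engine also applies the YAML-configured CIDR deny lists:

```python
import asyncio
import ipaddress
import socket

async def resolves_to_blocked(hostname: str) -> bool:
    """Resolve a hostname without blocking the event loop, then check
    every returned address against private/link-local/loopback ranges
    (which covers the 169.254.169.254 metadata endpoint)."""
    loop = asyncio.get_running_loop()
    infos = await loop.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    for *_, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        if ip.is_private or ip.is_link_local or ip.is_loopback:
            return True
    return False
```

Checking every resolved address (not just the first) also guards against DNS answers that mix public and internal IPs.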
LLM response parsing — Groq's LLM sometimes wraps JSON in markdown fences, adds preamble text, or returns out-of-range confidence values. We built a robust extraction pipeline with regex-based JSON finding, required key validation, and confidence clamping.
## What we learned
- The agent-to-tool boundary is a genuinely underserved attack surface with practical exploits
- Weighted cumulative scoring is significantly more useful than binary detection for security classification
- Fail-closed security requires thinking about every edge case — LLM timeout, DNS failure, malformed policy files — each needs an explicit decision about whether to block or pass
## What's next
- Multi-agent dashboard with filtering and per-agent analytics
- Webhook/Slack alerting on critical threats
- Policy editor UI with hot-reload
- PyPI publication for the SDK