AgentGuard

landing page
Prompt Injection Vulnerability Eval
demonstration of AgentGaurd

Inspiration

We were inspired by the incident described in “The McKinsey Breach Was SQL Injection. The Real Threat Was 95 Writable System Prompts.” It highlighted a deeper issue: modern AI systems are highly vulnerable and unpredictable not just at the database level, but at the prompt and agent level. This made us realize that as AI agents become more accessible, especially to non-technical users, the risk of malicious prompt injection and unintended actions increases significantly. Many users, including our own peers, are already using agents without fully understanding these risks. We built AgentGuard to make AI agent usage safer by default, especially for beginners who may not recognize when an agent is behaving dangerously.

What it does

AgentGuard is a policy-enforcement gateway that sits between an AI agent’s decision and the execution of real-world actions (APIs, tools, file access, etc.). The live demonstration calls agents running on OpenAI and Google Gemini models to illustrate prompt injection, and how AgentGaurd prevents the agents from accessing sensitive files.

Before any action is executed, AgentGuard evaluates it and returns one of two decisions: ALLOW — Safe to proceed DENY — Blocked due to risk Each decision is accompanied by a clear, human-readable explanation and a full audit trail.

AgentGuard is a safety layer that sits between an AI agent and real-world actions, evaluating every request using policy rules, prompt-injection patterns, and data-flow analysis to determine whether to allow or block tasks. It provides clear explanations, audit logs, and a dashboard so developers and users can monitor, control, and safely deploy AI agents.

How we built it

Frontend: HTML/CSS/JS + Flask (Python)
Backend: FastAPI (Python)
Deployments: Render Blueprint
AI Integration: Gemini API (via OpenRouter) for advanced prompt verification and analysis The system is designed as a modular middleware layer that can plug into any agent pipeline.

Challenges we ran into

Defining the interception layer: Determining exactly where AgentGuard should sit in the agent pipeline (before execution, but after intent formation) required careful design.

Accomplishments that we're proud of

We took a complex and abstract problem (AI agent safety) and built a working, practical solution within a hackathon timeframe.
We created a system that not only protects advanced users but also lowers the barrier for non-technical users to safely adopt AI agents.
We combined deterministic rules with AI-based analysis to produce transparent, explainable decisions.

What we learned

AI safety is not just about model alignment, it’s about safely handling execution.
Human-in-the-loop systems are critical for real-world deployment.
Clear explanations are just as important as correct decisions when building trust in AI systems.

What's next for AgentGuard

A desktop-based system that can monitor and enforce policies across all locally running agents
Plug-ins for popular agent frameworks (LangChain, AutoGen, etc.)
More advanced dashboards for tracking agent behavior over time