Inspiration
We were inspired by the incident described in “The McKinsey Breach Was SQL Injection. The Real Threat Was 95 Writable System Prompts.” It highlighted a deeper issue: modern AI systems are highly vulnerable and unpredictable not just at the database level, but at the prompt and agent level. This made us realize that as AI agents become more accessible, especially to non-technical users, the risk of malicious prompt injection and unintended actions increases significantly. Many users, including our own peers, are already using agents without fully understanding these risks. We built AgentGuard to make AI agent usage safer by default, especially for beginners who may not recognize when an agent is behaving dangerously.
What it does
AgentGuard is a policy-enforcement gateway that sits between an AI agent’s decision and the execution of real-world actions (APIs, tools, file access, etc.). The live demonstration calls agents running on OpenAI and Google Gemini models to illustrate prompt injection and to show how AgentGuard prevents the agents from accessing sensitive files.
Before any action is executed, AgentGuard evaluates it and returns one of two decisions:
- ALLOW: Safe to proceed
- DENY: Blocked due to risk
Each decision is accompanied by a clear, human-readable explanation and a full audit trail.
AgentGuard evaluates every request using policy rules, prompt-injection patterns, and data-flow analysis to decide whether to allow or block it. In addition to explanations and audit logs, it provides a dashboard so developers and users can monitor, control, and safely deploy AI agents.
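A minimal sketch of this evaluation flow, assuming a single `evaluate` call per requested action. The class names, the protected path prefixes, and the regular expressions are hypothetical and are not taken from the AgentGuard codebase. Data-flow analysis is omitted to keep the example short.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical policy data: illustrative only, not AgentGuard's real rules.
SENSITIVE_PATH_PREFIXES = ("/etc/", "/root/", "~/.ssh/")
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]


@dataclass
class Decision:
    verdict: str  # "ALLOW" or "DENY"
    reason: str   # human-readable explanation


@dataclass
class Gateway:
    audit_log: list = field(default_factory=list)

    def evaluate(self, tool: str, args: dict, context: str = "") -> Decision:
        """Decide on one requested action and record it in the audit log."""
        decision = self._decide(tool, args, context)
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "tool": tool,
            "args": args,
            "verdict": decision.verdict,
            "reason": decision.reason,
        })
        return decision

    def _decide(self, tool: str, args: dict, context: str) -> Decision:
        # Policy rule: file reads under protected prefixes are blocked.
        path = str(args.get("path", ""))
        if tool == "read_file" and path.startswith(SENSITIVE_PATH_PREFIXES):
            return Decision("DENY", f"Path {path!r} is in a protected location.")
        # Pattern check: the text that led to this action looks like an injection.
        for pattern in INJECTION_PATTERNS:
            if pattern.search(context):
                return Decision("DENY", "Input matches a known prompt-injection pattern.")
        return Decision("ALLOW", "No policy rule or injection pattern matched.")
```

Calling `Gateway().evaluate("read_file", {"path": "/etc/passwd"})` would return a DENY decision with its explanation, and every call, allowed or not, is appended to `audit_log`.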
How we built it
- Frontend: HTML/CSS/JS + Flask (Python)
- Backend: FastAPI (Python)
- Deployments: Render Blueprint
- AI Integration: Gemini API (via OpenRouter) for advanced prompt verification and analysis
The system is designed as a modular middleware layer that can plug into any agent pipeline.
Challenges we ran into
- Defining the interception layer: Determining exactly where AgentGuard should sit in the agent pipeline (before execution, but after intent formation) required careful design.
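One way to picture an interception point that sits after intent formation but before execution is a wrapper around each tool function: the agent has already chosen the tool and its arguments, and the check runs immediately before the side effect. The sketch below assumes tools are plain Python callables invoked with keyword arguments. `guarded`, `ActionDenied`, and `demo_check` are hypothetical names for illustration, not AgentGuard's API.

```python
import functools


class ActionDenied(Exception):
    """Raised when the policy check blocks a tool call."""


def guarded(check):
    """Wrap a tool so `check(tool_name, kwargs)` runs before the tool does.

    `check` is a hypothetical callable returning (allowed: bool, reason: str).
    """
    def decorator(tool_fn):
        @functools.wraps(tool_fn)
        def wrapper(**kwargs):
            allowed, reason = check(tool_fn.__name__, kwargs)
            if not allowed:
                # The tool body never runs, so no side effect occurs.
                raise ActionDenied(f"{tool_fn.__name__} blocked: {reason}")
            return tool_fn(**kwargs)
        return wrapper
    return decorator


def demo_check(tool_name, kwargs):
    # Illustrative rule only: deletes are permitted inside sandbox/ and nowhere else.
    if tool_name == "delete_file" and not kwargs.get("path", "").startswith("sandbox/"):
        return False, "deletes are only permitted inside sandbox/"
    return True, "no rule matched"


deleted = []  # stand-in for a real filesystem side effect


@guarded(demo_check)
def delete_file(path):
    deleted.append(path)
    return f"deleted {path}"
```

Because the check wraps the tool and not the model call, the same guard can be reused regardless of which model produced the request.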
Accomplishments that we're proud of
- We took a complex and abstract problem (AI agent safety) and built a working, practical solution within a hackathon timeframe.
- We created a system that not only protects advanced users but also lowers the barrier for non-technical users to safely adopt AI agents.
- We combined deterministic rules with AI-based analysis to produce transparent, explainable decisions.
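As a rough sketch of how deterministic rules and AI-based analysis might be layered, the example below runs the rules first and consults a model-based risk scorer only when no rule fires. The function names, the example rule, and the 0.5 threshold are assumptions for illustration. `fake_scorer` is a stand-in for a real model call and does not contact any API.

```python
from typing import Callable, Optional


def rule_layer(action: dict) -> Optional[str]:
    """Deterministic checks. Returns a denial reason, or None if no rule fires."""
    if action.get("tool") == "shell" and "rm -rf" in action.get("command", ""):
        return "rule: destructive shell command"
    return None


def decide(action: dict,
           risk_scorer: Callable[[dict], float],
           threshold: float = 0.5) -> dict:
    """Rules take precedence; the scorer is used only when no rule matches."""
    reason = rule_layer(action)
    if reason is not None:
        return {"verdict": "DENY", "source": "rules", "reason": reason}
    score = risk_scorer(action)
    if score >= threshold:
        return {"verdict": "DENY", "source": "model",
                "reason": f"model risk score {score:.2f} >= {threshold:.2f}"}
    return {"verdict": "ALLOW", "source": "model",
            "reason": f"model risk score {score:.2f} < {threshold:.2f}"}


def fake_scorer(action: dict) -> float:
    # Stand-in for an LLM call; a real scorer would query a model API.
    return 0.9 if "password" in str(action).lower() else 0.1
```

Recording a `source` field alongside the reason keeps the decision explainable: a reader can tell whether a fixed rule or the model's score produced the verdict.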
What we learned
- AI safety is not just about model alignment; it’s also about safely handling execution.
- Human-in-the-loop systems are critical for real-world deployment.
- Clear explanations are just as important as correct decisions when building trust in AI systems.
What's next for AgentGuard
- A desktop-based system that can monitor and enforce policies across all locally running agents
- Plug-ins for popular agent frameworks (LangChain, AutoGen, etc.)
- More advanced dashboards for tracking agent behavior over time
Built With
- claude
- cloudflare
- css
- fastapi
- flask
- gemini
- github
- html
- javascript
- openai
- openrouter
- porkbun
- python
- render