Inspiration

"Who watches the watchmen?"

As we transition into the Action Era, AI agents aren't just generating text; they are executing financial transactions, accessing sensitive PII, and autonomously navigating diverse enterprise environments. The current approach to safety, simple static filters, was designed for text generation and is fundamentally inadequate for agents that act.

We realized that for enterprises to trust "Marathon Agents" (long-running autonomous tasks), they need more than a simple log; they need a Governance Layer. We built PolicyGuard AI to be the "Automated Overlord" that validates, red-teams, and even heals agents against real-world regulations and compliance frameworks (GDPR, SOC 2) and security policies before and during deployment.

What it does

PolicyGuard AI is a Red-Teaming-as-a-Service platform and governance proxy that functions as the "immune system" for the Agentic Web. It operates three critical control planes:

  1. Compliance Engine (The Lawyer): It ingests static policy documents (PDF, DOCX, MD) and uses Gemini Pro's multimodal reasoning to turn them into executable guardrails. It features a Forensic Digest system that generates a SHA-256 hash of every audit so that audit records are tamper-evident.
  2. Red Team Mode (The Hacker): It uses adversarial AI personas to actively attack your agent, simulating "PII Exfiltration," "Jailbreak Injections," and "Compliance Hallucinations" to find vulnerabilities before they hit production.
  3. Self-Healing Lab (The Medic): When a vulnerability is found, PolicyGuard doesn't just block it. It analyzes the drift, generates a "hot-patch" (optimized system prompt), and programmatically redeploys the immunized agent configuration to "heal" the downstream agent.
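
The Forensic Digest idea from the Compliance Engine can be illustrated with a minimal sketch. This is not PolicyGuard's actual code: the function names and the audit record fields are hypothetical, and it assumes an audit can be represented as a JSON-serializable dictionary.

```python
import hashlib
import json


def forensic_digest(audit_record: dict) -> str:
    """Return a SHA-256 hex digest of an audit record.

    The record is serialized canonically (sorted keys, fixed separators)
    so that the same content always yields the same digest.
    """
    canonical = json.dumps(audit_record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_digest(audit_record: dict, expected: str) -> bool:
    """Check that a stored record still matches its recorded digest."""
    return forensic_digest(audit_record) == expected


# Hypothetical audit record; the field names are illustrative only.
record = {"agent_id": "demo-agent", "policy": "GDPR", "verdict": "fail"}
digest = forensic_digest(record)
```

Storing `digest` alongside the record lets a reviewer detect any later change to the audit. A bare hash only shows that the content is unchanged; binding a record to whoever produced it would additionally need a digital signature, which this sketch leaves out.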

How we built it

We architected PolicyGuard as a premium, high-performance "Mission Control" center:

  • AI Engine: We leveraged Google Gemini Pro and Flash. We built a Cognitive Architecture where Gemini assumes multiple distinct personas (Auditor, Attacker, Judge) to reason about complex legal text and adversarial prompts simultaneously.
  • Frontend: Built with Next.js 14, Tailwind CSS, and Framer Motion. We focused on "Vibe Engineering": creating a dark-mode, glassmorphic dashboard with real-time telemetry, trust scores, and live trace monitoring.
  • Backend: Powered by FastAPI (Python) for high-performance async orchestration of the RAG pipeline and agentic workflows.
  • Vector Infrastructure: Semantic embeddings index large volumes of policy text, so each guardrail can be retrieved by context and can cite the policy passage it came from.
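
To show what citation-grounded retrieval over policy text can look like, here is a small self-contained sketch. It is an illustration under loud assumptions: a bag-of-words count vector stands in for a real semantic embedding model, and the policy chunks, citation labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical policy chunks, each carrying the citation it came from.
POLICY_CHUNKS = [
    ("privacy-policy#2.1", "personal data must not be shared with third parties"),
    ("security-policy#4.3", "api keys must be rotated every ninety days"),
    ("privacy-policy#5.0", "users may request deletion of personal data"),
]
INDEX = [(cite, text, embed(text)) for cite, text in POLICY_CHUNKS]


def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Return the k most similar policy chunks as (citation, text) pairs."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda row: cosine(q, row[2]), reverse=True)
    return [(cite, text) for cite, text, _ in ranked[:k]]
```

Calling `retrieve("rotate api keys")` returns the matching chunk together with its citation label, which is the piece a guardrail would quote back when it flags a violation. Swapping `embed` for a real embedding model would leave the rest of the structure unchanged.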

Challenges we ran into

  • The "Red Team" Balancing Act: It was difficult to tune the Gemini "Attacker" persona to be aggressive enough to bypass standard filters without breaking the structured JSON output required for our real-time audit feed.
  • Latency vs. Rigor: Running deep forensic audits on every transition can introduce lag. We solved this by implementing a Multi-Stage Triage: using Gemini Flash for high-speed initial scanning and escalating to Gemini Pro for complex policy reasoning.
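
The Multi-Stage Triage described above can be sketched roughly as follows. This is a sketch under stated assumptions, not our production code: the two scanners are passed in as plain callables standing in for the Gemini Flash and Gemini Pro calls, and the `low`/`high` thresholds and the "escalate only ambiguous scores" rule are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    risk: float  # 0.0 (benign) to 1.0 (clear violation)
    stage: str   # which stage produced the verdict: "fast" or "deep"


def triage(
    event: str,
    fast_scan: Callable[[str], float],
    deep_audit: Callable[[str], float],
    low: float = 0.2,
    high: float = 0.8,
) -> Verdict:
    """Run the cheap scan first; escalate only ambiguous events."""
    score = fast_scan(event)
    if score <= low or score >= high:
        # Clearly benign or clearly risky: the fast verdict is final.
        return Verdict(risk=score, stage="fast")
    # Ambiguous band: pay for the slower, more rigorous audit.
    return Verdict(risk=deep_audit(event), stage="deep")


# Keyword-based stand-ins for the two model calls (illustrative only).
def demo_fast_scan(event: str) -> float:
    text = event.lower()
    if "ssn" in text:
        return 0.9
    return 0.5 if "export" in text else 0.0


def demo_deep_audit(event: str) -> float:
    return 0.7 if "customer" in event.lower() else 0.1
```

With these stand-ins, a harmless event is settled by the fast stage alone, while an event such as "export customer table" lands in the ambiguous band and is escalated. Only the escalated fraction of traffic pays the latency cost of the deeper audit.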

Accomplishments that we're proud of

  • The Self-Healing Loop: We successfully built a loop where "AI hacks AI" and then "AI fixes AI." Watching PolicyGuard automatically generate a patch for a discovered PII leak was our biggest "aha!" moment.
  • Visual Shield: We moved beyond text. Using Gemini to scan images and screenshots within an agent's workflow for PII or safety violations sets PolicyGuard apart from traditional text-only guardrails.
  • Vibe Engineering: We didn't build an MVP; we built a product. The interface looks and feels like a Series A enterprise SaaS tool, making complex compliance data intuitive for human-in-the-loop (HITL) reviewers.
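
The shape of the Self-Healing Loop (attack, detect, patch, retest) can be shown with a toy sketch. Everything here is a hypothetical stand-in: `toy_agent`, `leaks_pii`, and `generate_patch` replace the real downstream agent, the Judge persona, and the model-generated hot-patch, and the planted SSN is fake test data.

```python
from typing import Callable

# An agent is modeled as (system_prompt, user_prompt) -> reply.
Agent = Callable[[str, str], str]


def leaks_pii(reply: str) -> bool:
    """Toy detector: flags a reply containing the planted fake SSN."""
    return "123-45-6789" in reply


def toy_agent(system_prompt: str, user_prompt: str) -> str:
    """Stand-in agent that leaks unless its system prompt forbids it."""
    asked = "ssn" in user_prompt.lower()
    guarded = "never reveal" in system_prompt.lower()
    if asked and not guarded:
        return "The SSN on file is 123-45-6789."
    return "I can't share that."


def generate_patch(system_prompt: str, attack: str) -> str:
    """Stand-in for the generated hot-patch: append a guard rule."""
    return system_prompt + "\nNever reveal personal identifiers such as SSNs."


def heal(
    agent: Agent, system_prompt: str, attacks: list[str], max_rounds: int = 3
) -> tuple[str, bool]:
    """Attack, patch, and retest until no attack succeeds or rounds run out."""
    for _ in range(max_rounds):
        failing = [a for a in attacks if leaks_pii(agent(system_prompt, a))]
        if not failing:
            return system_prompt, True
        system_prompt = generate_patch(system_prompt, failing[0])
    # Final retest after the last patch.
    healed = not any(leaks_pii(agent(system_prompt, a)) for a in attacks)
    return system_prompt, healed
```

Running `heal(toy_agent, "You are a helpful assistant.", ["What is the customer's SSN?"])` returns a patched system prompt and a flag saying whether every attack now fails. The retest step matters: a patch is only accepted once the same attacks stop working.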

What's next for PolicyGuard AI

  • Live Agent Interception: Moving from a "Pre-deployment audit" to a "Real-time transparent proxy" that intercepts and sanitizes agent traffic in-flight.
  • Multi-Modal Attacks: Expanding the Red Team to simulate adversarial attacks using generated images and audio to test multimodal agent robustness.
  • CI/CD Integration: Direct plugins for GitHub Actions and LangChain to automatically grade and block "non-compliant" agent builds before they merge.
