ARGUS — AI Agent Monitoring & Threat Intelligence

ARGUS SAFE

⚠️ Note for judges: Agent responses take 30-60 seconds due to the dual LLM pipeline (Vertex AI worker agent + Gemini 3.5 Flash ARGUS safety check running sequentially). This is expected behavior. Please allow up to 90 seconds per request when testing.

Inspiration

As AI agents proliferate across enterprises — handling medical triage, financial fraud detection, legal compliance, and customer support — a critical blind spot emerges: who watches the agents? They can be hijacked via prompt injection, communicate secretly with each other, or produce dangerous outputs, all without any human noticing. I built ARGUS to solve this.

What it does

ARGUS is a real-time AI safety monitoring system for multi-agent environments. It monitors 15 simulated AI agents across 5 industries — detecting prompt injection attacks, jailbreak attempts, rogue inter-agent communications, and dangerous outputs — all in real time.

What makes ARGUS unique: ARGUS is itself an AI agent built with Google ADK and Gemini 3.5 Flash. It reads its own observability data via Arize Phoenix MCP to continuously improve its threat detection rules — a closed-loop self-improving safety system.

Key capabilities:

Prompt Injection Detection — scans all agent inputs for injection patterns
Output Interception — screens all agent outputs before delivery
Rogue Communication Detection — validates inter-agent message flows
LLM-as-a-Judge Evaluation — quality scoring on every agent response
Self-Improvement Loop — ARGUS queries Phoenix traces and updates its own rules
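
As a rough illustration of the first capability, pattern-based input scanning might look like the sketch below. The rule names and regular expressions are hypothetical and are not ARGUS's actual ruleset. They match injection phrasing rather than single keywords, so a medical query that mentions an "injection" is not flagged.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration only; not ARGUS's real rules.
INJECTION_PATTERNS = {
    "override_instructions": re.compile(
        r"\b(ignore|disregard|forget)\b.{0,40}\b(previous|prior|above)\b"
        r".{0,20}\b(instructions|rules|prompt)\b",
        re.IGNORECASE,
    ),
    "system_prompt_extraction": re.compile(
        r"\b(reveal|show|print|repeat)\b.{0,40}\bsystem prompt\b",
        re.IGNORECASE,
    ),
    "role_hijack": re.compile(r"\byou are now\b", re.IGNORECASE),
}


@dataclass
class ScanResult:
    blocked: bool
    matched_rules: list[str]


def scan_input(text: str) -> ScanResult:
    """Return which (hypothetical) injection rules match the agent input."""
    matched = [
        name for name, pattern in INJECTION_PATTERNS.items()
        if pattern.search(text)
    ]
    return ScanResult(blocked=bool(matched), matched_rules=matched)
```

A regex pass like this is cheap enough to run on every input, but it only catches known phrasing. That is presumably why it is paired with the LLM-as-a-Judge evaluation listed above.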

How we built it

Agent Runtime: Google ADK (Agent Development Kit) powers all 16 agents including ARGUS itself.

LLM: Gemini 3.5 Flash (via Google AI Studio) for the ARGUS monitor; Gemini 2.5 Flash Lite (via Vertex AI) for the 15 worker agents.

Observability: Arize Phoenix Cloud with OpenInference instrumentation. Every agent span, tool call, and LLM response is traced with OpenTelemetry.

Self-Improvement: ARGUS uses the Phoenix MCP server to query its own traces, detect threat patterns, synthesize new detection rules, and apply them autonomously. After each event of HIGH severity or above, it bumps its ruleset version (for example, from v2.0.4 to v2.0.5).
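
The versioning step can be sketched as follows. This is a minimal sketch, not the project's code: the `Severity` levels other than HIGH, the `Ruleset` class, and `apply_event` are all assumed names, and the Phoenix MCP query that would produce `new_rule` is left out.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    # Only HIGH is named in the write-up; the other levels are assumptions.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Ruleset:
    version: str = "v2.0.4"
    rules: list[str] = field(default_factory=list)

    def bump_patch(self) -> None:
        """Increment the patch number, e.g. v2.0.4 -> v2.0.5."""
        major, minor, patch = self.version.lstrip("v").split(".")
        self.version = f"v{major}.{minor}.{int(patch) + 1}"


def apply_event(ruleset: Ruleset, severity: Severity, new_rule: str) -> bool:
    """Add new_rule and bump the version only for HIGH severity or above."""
    if severity < Severity.HIGH:
        return False
    ruleset.rules.append(new_rule)
    ruleset.bump_patch()
    return True
```

Gating on severity keeps low-grade noise from rewriting the ruleset, and the version string gives each trace a way to record which rules were active when it was evaluated.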

Backend: FastAPI serving the 15 worker agents and the ARGUS endpoints, deployed on Google Cloud Run.

Frontend: Next.js dashboard with real-time threat monitoring, agent console, Phoenix trace viewer, and threat center.

Challenges we ran into

Dual API architecture: Worker agents use Vertex AI while ARGUS uses Google AI Studio for Gemini 3.5 Flash. The two backends use different auth systems, so each request has to switch environment variables. Because those variables are process-wide state, concurrent requests could race on them, and an asyncio lock was needed to serialize the switch.
False positive prevention: Early versions of ARGUS flagged legitimate medical queries (drug interaction checks) as threats because they contained words like "injection." Tuning the threat taxonomy required careful context-aware pattern matching and semantic evaluation.
OpenInference instrumentation: Getting the GoogleADKInstrumentor to correctly populate input/output fields in Phoenix required upgrading the instrumentation package from v0.1.3 to v0.1.15 to match the ADK 2.1.0 API.
Self-improvement loop latency: Each ARGUS analysis takes 20-45 seconds due to multi-tool chaining. I implemented asyncio timeouts and graceful fallbacks to keep the system responsive.
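
The lock and timeout ideas from the list above can be combined in one helper, sketched here under stated assumptions. `GOOGLE_GENAI_USE_VERTEXAI` is the variable the Google Gen AI SDK reads to pick a backend, but whether the project toggles exactly this variable is an assumption. `guarded_call` and `demo` are hypothetical names, and the model call is replaced by a stand-in coroutine.

```python
import asyncio
import os

# Assumption: this is the variable being switched between the two backends.
ENV_VAR = "GOOGLE_GENAI_USE_VERTEXAI"


async def guarded_call(lock, use_vertex, call, timeout_s, fallback):
    """Run call() with the backend env var set, under a lock and a timeout."""
    async with lock:  # os.environ is process-wide, so one switch at a time
        previous = os.environ.get(ENV_VAR)
        os.environ[ENV_VAR] = "TRUE" if use_vertex else "FALSE"
        try:
            return await asyncio.wait_for(call(), timeout=timeout_s)
        except asyncio.TimeoutError:
            return fallback  # graceful fallback instead of a hung request
        finally:
            # Restore the earlier value so the next caller starts clean.
            if previous is None:
                os.environ.pop(ENV_VAR, None)
            else:
                os.environ[ENV_VAR] = previous


async def demo():
    lock = asyncio.Lock()
    seen = []

    async def fake_model_call(delay):
        seen.append(os.environ[ENV_VAR])  # record which backend was active
        await asyncio.sleep(delay)
        return "ok"

    results = await asyncio.gather(
        guarded_call(lock, True, lambda: fake_model_call(0.01), 1.0, "fallback"),
        guarded_call(lock, False, lambda: fake_model_call(0.5), 0.05, "fallback"),
    )
    return results, seen
```

Holding the lock for the full call serializes requests, which trades throughput for correctness. The timeout bounds how long one slow analysis can hold everything else up.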

Accomplishments that we're proud of

Built a fully working self-improving AI safety system in under 3 weeks, solo
Real traces flowing into Arize Phoenix with input/output populated and green status in the Phoenix dashboard
ARGUS successfully detects and blocks prompt injection, jailbreak attempts, and system prompt extraction in real time
The self-improvement loop actually works — ARGUS reads its Phoenix traces and generates new detection rules autonomously
15 LLM-backed agents across 5 industries, all instrumented and monitored

What we learned

Google ADK is incredibly powerful for building multi-agent systems — the Runner and InMemorySessionService make agent orchestration straightforward
Arize Phoenix MCP is a game-changer for AI observability — being able to query traces programmatically enables genuinely autonomous self-improvement
OpenInference instrumentation is the bridge between ADK and Phoenix — getting it right unlocks the full observability pipeline
AI safety is not just about filters — it requires contextual judgment, proportional response, and continuous learning from real-world data

What's next for ARGUS — AI Agent Monitoring & Threat Intelligence

Real-time WebSocket updates — live threat feed without polling
Multi-tenant support — monitor multiple organizations' agent fleets
Threat correlation engine — detect coordinated attacks across agents
Automated remediation — ARGUS autonomously quarantines compromised agents
Integration marketplace — connect to any agent framework (LangChain, CrewAI, AutoGen) not just Google ADK

Built With

arize-phoenix
fastapi
gemini-2.5-flash-lite
gemini-3.5-flash
google-adk
google-cloud-run
next.js
openinference
opentelemetry
phoenix-mcp
python
typescript
vertex-ai

Updates

Abiraminayagi S started this project — Jun 11, 2026 12:23 PM EDT
