LLM Guardian Agent (Google AI + MongoDB)

with AI Self Improving Threat Knowledge Development Agent

Inspiration

Real world problem: LLM Threat Attacks

Solution: Autonomous LLM Threat Detection Tool with continuous LLM Threat Knowledge Development

The EU AI Act mandates that AI systems be protected against adversarial attacks and that all interactions be logged for audit and accountability — legal obligations, not optional practices. Most teams shipping LLM-powered products treat prompt injection, jailbreaks, credential harvesting, and data exfiltration as afterthoughts. The regulation signals where the industry is heading: AI protection and audit logging will become baseline requirements.

A Devpost notification about this hackathon arrived at the right moment. It brought Google AI and MongoDB together on one track — exactly the right stack for a serious AI security solution: Gemini for intelligence, MongoDB Atlas for the data layer, audit logging, and knowledge persistence. The requirement to build on Google Cloud's Agent Platform led us to Agent Garden, where Cyber Guardian demonstrated how to compose a production-grade multi-agent system with ADK. That became the structural foundation for LLM Guardian.

From Cyber Guardian to LLM Guardian: A Rapid Agent Build

We didn't start from a blank slate. Cyber Guardian — a sample multi-agent system from Google's Agent Garden — was our starting point. It's designed for traditional cybersecurity incident response: an orchestrator coordinating specialized sub-agents for triage, threat intel, investigation, and response.

We ran it, studied how it was built, and used it as our reference for how a production-grade multi-agent system should be structured with ADK. Seeing it work end-to-end gave us confidence in the pattern and let us skip the scaffolding phase entirely — we knew what a working multi-agent ADK system looked like before we wrote a single line of LLM Guardian.

From there, we replaced every component with LLM-security-specific logic: new sub-agents for the full OWASP LLM threat lifecycle, MongoDB Atlas as the persistent data layer, a two-tier detection architecture, and a self-building threat knowledge library. The domain changed completely; the ADK composition pattern carried over. That's what made the build fast.

What it does

LLM Guardian Agent is a hierarchical, agentic security system that protects LLM applications against the OWASP Top 10 for LLM Applications 2025. It treats AI security and governance as three inseparable disciplines — protection, monitoring, and review — and delivers all three:

  • Protection — inbound prompts and outbound responses are scanned in real time for prompt injection, jailbreaks, credential harvesting, model hijacking, data exfiltration, PII leakage, and policy violations.
  • Monitoring — every interaction is logged with full audit trails, incident records, and OWASP category mapping, giving teams visibility into what their LLM is actually exposed to.
  • Review — findings are synthesized into structured incident reports with mitigation strategies, so security and governance teams can act, not just observe.

The system runs in two tiers:

  • Tier 1 — Semantic threat knowledge detection. A first line of defense that runs every inbound prompt against the self-building threat_knowledge library using Voyage AI 4 vector search. Matches known and previously learned attack patterns with high precision — and gets smarter with every novel threat T2 discovers.
  • Tier 2 — Bad-actor intention detection & knowledge development. A deeper reasoning layer that analyzes intent and conversational context to catch multi-step and indirect attacks, then feeds what it learns back into a continuously growing threat knowledge library.

Most importantly, the guardian is self-motivated: it doesn't just detect threats, it autonomously builds and expands its own threat knowledge base over time — entirely driven by Gemini and MongoDB.

System Components

LLM Guardian is composed of three components, all backed by MongoDB Atlas as the shared data layer.


Component 1 — LLM Guardian Agent

The core agentic security engine. A Python ADK application that runs the full multi-agent detection pipeline — accepting raw prompts or model responses and returning structured security verdicts. Built on Google ADK 2.1.0, Gemini via Vertex AI, and MongoDB Atlas, with Voyage AI 4 for semantic embeddings.


Component 2 — LLM Guardian Console

A Next.js governance dashboard that connects directly to MongoDB Atlas and provides a real-time view into everything the agent logs. Security and governance teams use it to monitor scans, investigate incidents, manage detection rules, and browse the self-building threat knowledge library.

Page What it shows
/ Dashboard — live scan counts, risk distribution, recent incidents
/scans All scan requests with verdict (BLOCK / WARN / ALLOW), risk score, and OWASP categories
/incidents Full incident reports with threat details and mitigation recommendations
/audit-trail Forensic audit log — every step from request intake to final verdict
/detection-rules Browse and manage all 8 threat type detection rules
/threat-knowledge The self-building knowledge library — AI-inferred threats persisted by the knowledge developer agent
/pii-registry PII patterns and compliance framework tags
/content-policy Content policy rules for unsafe content scanning

Component 3 — MongoDB Atlas

The shared AI-ready data layer for the entire system. Every detection rule, PII pattern, content policy, threat knowledge record, audit log, security report, and incident is stored in Atlas. The agent writes; the console reads. Both components are stateless — Atlas is the single source of truth.

Atlas features used:

  • Vector Search via autoEmbed — the threat knowledge library is vector-indexed using an Atlas autoEmbed index (Voyage AI 4, 1024-dim). Agents call MongoDB MCP aggregate with plain text — Atlas calls Voyage AI internally and returns semantically matched threats. No pre-computation needed in the application.
  • Atlas Search — full-text $search used alongside vector search for exact indicator matching in keyword-based threat lookup.
  • MongoDB MCP server — agents call Atlas as a first-class tool via the mongodb-mcp-server npm package and ADK McpToolset. The detection agent uses MCP aggregate for semantic threat search on every scan; the knowledge developer uses MCP aggregate for duplicate detection and MCP insert-many to persist new threat records.
  • Standard indexes — fast rule lookups and audit queries across all collections.
┌─────────────────────────────────────────────────────┐
│                  GCP Cloud Run                      │
│                                                     │
│  ┌─────────────────────┐  ┌──────────────────────┐  │
│  │  llm-guardian-agent │  │ llm-guardian-console │  │
│  │  (Python / ADK)     │  │  (Next.js / Node)    │  │
│  │  gemini-3.5-flash   │  │                      │  │
│  │  + 3.1-flash-lite   │  │  Governance UI       │  │
│  └──────────┬──────────┘  └──────────┬───────────┘  │
│             │                        │              │
│    Vertex AI (Gemini)        GCP Secret Manager     │
└─────────────┼────────────────────────┼──────────────┘
              │                        │
              └──────────┬─────────────┘
                         │
              ┌──────────▼──────────┐
              │   MongoDB Atlas M0  │
              │   (Free Tier)       │
              │   8 collections     │
              │   Vector Search     │
              └─────────────────────┘

ADK Implementation: Agents and Tools

2-Tier (T1 & T2) Multi-Layer Detection Framework

                        ┌─────────────────────────────────────────────────┐
                        │         llm_guardian_orchestrator               │
                        │         gemini-3.5-flash (ORCHESTRATOR_MODEL)   │
                        │                                                 │
                        │  FunctionTools (audit lifecycle):               │
                        │    start_audit_log                              │
                        │    record_agent_result                          │
                        │    finalize_audit_log                           │
                        │                                                 │
                        │  AgentTools (sub-agents):                       │
                        │    detection_agent     analysis_agent           │
                        │    t2_analyst_agent    knowledge_developer_agent│
                        │    response_agent      remediation_agent        │
                        └───────────────┬─────────────────────────────────┘
                                        │ every call → record_agent_result
                                        │
          ┌─────────────────────────────┼───────────────────────────────┐
          ▼                             ▼                               ▼
  ┌───────────────┐           ┌─────────────────┐             ┌──────────────────┐
  │ detection_    │           │  t2_analyst_    │             │  response_agent  │
  │ agent (T1)    │           │  agent (T2)     │             │  gemini-3.1-     │
  │ gemini-3.1-   │           │  gemini-3.1-    │             │  flash-lite      │
  │ flash-lite    │           │  flash-lite     │             │  (outbound path) │
  └───────┬───────┘           └────────┬────────┘             └──────────────────┘
          │                            │ risk > 0.15
          │                   ┌────────▼──────────────┐
          │                   │  knowledge_developer_ │
          │                   │  agent                │
          │                   │  gemini-3.5-flash     │
          │                   │  (self-learning loop) │
          │                   └────────┬──────────────┘
          │ threats detected           │
          └──────────┬─────────────────┘
                     │
                     ▼
            ┌─────────────────┐
            │  analysis_agent │
            │  gemini-3.1-    │
            │  flash-lite     │
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │ remediation_    │
            │ agent           │
            │ gemini-3.1-     │
            │ flash-lite      │
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │  MongoDB Atlas  │
            │  (all writes)   │
            └─────────────────┘

Architecture Overview

LLM Guardian is a hierarchical multi-agent system built with Google ADK 2.1.0. The root orchestrator holds six sub-agents as AgentTool instances, meaning the orchestrator calls each as a tool, receives results inline, and decides what to do next — a key difference from Cyber Guardian's passive sub_agents=[] dispatch.

llm_guardian_orchestrator  (gemini-3.5-flash, orchestrator model)
│
├── [FunctionTool] start_audit_log          — opens a forensic audit record in MongoDB
├── [FunctionTool] record_agent_result      — stamps each sub-agent's output into the record
├── [FunctionTool] finalize_audit_log       — seals and persists the complete record
│
├── [AgentTool] detection_agent             — Tier 1 inbound threat scanner
├── [AgentTool] analysis_agent              — deep context/conversation scanner
├── [AgentTool] response_agent              — outbound response scanner
├── [AgentTool] remediation_agent           — OWASP mapper + report generator
├── [AgentTool] t2_analyst_agent            — Tier 2 novel/obfuscated threat classifier
└── [AgentTool] knowledge_developer_agent  — autonomous threat knowledge builder

Every run follows a deterministic 7-step pipeline encoded directly in the orchestrator's system prompt: start audit → detect → (T2 fallback if clean) → (knowledge develop if T2 fires) → analyze → remediate → finalize audit. Verdicts (BLOCK / WARN / ALLOW) are computed from a probabilistic risk formula, not LLM judgment.


Root Orchestrator — llm_guardian_orchestrator

Model: gemini-3.5-flash (latest, via ORCHESTRATOR_MODEL_ID)
File: llm_guardian/agent.py

The orchestrator is the core coordination brain — powered by gemini-3.5-flash, the most capable model in the system. Its system prompt forbids it from doing analysis — it may only route, call tools, pass JSON, and emit the final report. It owns three audit FunctionTools directly and wraps each sub-agent as an AgentTool. This design means the orchestrator never loses intermediate results: each sub-agent response comes back as a tool return value, and record_agent_result immediately stamps it into the MongoDB audit log before the next step. Using gemini-3.5-flash for orchestration ensures robust multi-step reasoning, accurate verdict determination, and correct pipeline sequencing across all 7 steps — the place where mistakes are most costly.


Detection Agent — detection_agent

Model: gemini-3.1-flash-lite (fast tier — large context, low latency)
File: llm_guardian/sub_agents/detection/
Output key: detection_agent_output

The Tier 1 scanner. Runs 10 tools in parallel on every inbound prompt — two against the self-building threat_knowledge library and eight OWASP rule-based detectors:

Tool Transport What it does
aggregate (autoembed_index) MongoDB MCP Semantic vector search via $vectorSearch using the Atlas autoEmbed index — plain text goes in, Atlas calls Voyage AI 4 internally and returns scored matches from threat_knowledge. Score ≥ 0.45 surfaces confirmed malicious patterns.
query_threat_knowledge_keyword FunctionTool Atlas full-text $search against threat_knowledge indicator, threat name, and description fields — catches exact phrase matches semantic search may miss.
detect_jailbreak_patterns FunctionTool LLM01/LLM07 — DAN mode, role-play, instruction overrides; regex + keyword rules from MongoDB detection_rules.
detect_credential_harvesting FunctionTool LLM02 — credential and PII extraction intent; rules from MongoDB.
detect_data_exfiltration FunctionTool LLM06 — bulk data extraction, policy circumvention; rules from MongoDB.
detect_hallucination FunctionTool LLM09 — unsupported factual claims; claim indicators from MongoDB.
detect_supply_chain_manipulation FunctionTool LLM03 — plugin load, package manipulation; rules from MongoDB.
detect_data_poisoning FunctionTool LLM04 — behavioral anchoring, training injection; rules from MongoDB.
detect_embedding_attack FunctionTool LLM08 — RAG poisoning, embedding extraction; rules from MongoDB.
detect_resource_exhaustion FunctionTool LLM10 — loop bombs, oversized inputs; rules from MongoDB.

The MongoDB MCP aggregate call is the live connection to the self-learning knowledge library — every novel attack the knowledge developer persists is immediately searchable on the next scan, without any code change.


Analysis Agent — analysis_agent

Model: gemini-3.1-flash-lite
File: llm_guardian/sub_agents/analysis/
Output key: analysis_agent_output

Deep context scanner for multi-turn conversation history. Called when Tier 1 fires threats, or after T2 confirms a novel attack. Three tools:

Tool What it does
analyze_context_window Scans conversation turns for role-play escalation chains (≥2 turns = HIGH) and instruction override patterns (CRITICAL)
detect_system_prompt_extraction Detects extraction attempts against system prompt across conversation history; flags if system prompt snippet is exposed
scan_conversation_history Maps each turn against override, extraction, and jailbreak keyword lists from MongoDB; builds a sequential attack chain

All keyword lists are loaded from MongoDB (context_window, system_prompt_extraction, conversation_history rule types).


Response Agent — response_agent

Model: gemini-3.1-flash-lite
File: llm_guardian/sub_agents/response/
Output key: response_agent_output

Outbound scanner for model-generated responses. Called when the orchestrator classifies input as an outbound response. Four tools:

Tool What it does
scan_pii_leakage Regex scan against PII patterns from MongoDB pii_registry; reports compliance framework per match
scan_credential_exposure Hardcoded regex patterns for API keys, AWS secrets, password hashes, JWT tokens in model output
scan_hallucination Extracts factual claims from response, runs vector search against threat_knowledge in MongoDB; low match score flags potential hallucination
scan_unsafe_content Regex against content_policy_rules from MongoDB; catches harmful, illegal, or policy-violating content

T2 Analyst Agent — t2_analyst_agent

Model: gemini-3.1-flash-lite
File: llm_guardian/sub_agents/t2_analyst/
Output key: t2_analyst_output

Tier 2 fallback — only invoked when Tier 1 returns zero threats and risk ≤ 0.15. Uses pure rule-based structural analysis combined with the LLM's security reasoning to catch novel, obfuscated, or indirect attacks that don't match known patterns. Three tools:

Tool What it does
extract_attack_features Counts imperative commands, social engineering phrases, privilege escalation language, and obfuscation signals (base64, Unicode tricks, hex encoding); produces a suspicion_indicators score
assess_malicious_intent Scores five intent dimensions independently (deception, coercion, extraction, manipulation, bypass), combines them into an intent_score 0–1, identifies the dominant attack dimension
check_known_attack_taxonomy Validates the LLM's proposed attack classification against the full OWASP LLM Top 10 2025 taxonomy; determines is_novel_attack and explains novelty if the pattern variant or category is new

The T2 agent has no MongoDB dependency — its tools are pure logic, and the LLM synthesizes a final classification on top of the structural signals.


Knowledge Developer Agent — knowledge_developer_agent

Model: gemini-3.5-flash (same as orchestrator — requires deep reasoning to decide what's worth learning)
File: llm_guardian/sub_agents/knowledge_developer/
Output key: knowledge_developer_output

The self-learning engine. Mandatory when T2 returns risk > 0.15. Converts a novel attack detection into a durable MongoDB record so Tier 1 catches it next time. Uses MongoDB MCP for both the duplicate check and the insert:

Tool Transport What it does
aggregate (autoembed_index) MongoDB MCP Semantic duplicate check — $vectorSearch against threat_knowledge using the Atlas autoEmbed index. If any match scores ≥ 0.53, the record is a near-duplicate and is skipped.
insert-many MongoDB MCP Persists the assembled threat record to threat_knowledge. The Atlas autoEmbed index generates the Voyage AI 4 vector embedding from the indicator field at insert time — no pre-computation step needed.
build_threat_record FunctionTool Pure document assembler — derives canonical indicator, maps to OWASP, validates pattern type, returns the ready-to-insert dict. No MongoDB calls; the LLM passes the result directly to MCP insert-many.
generate_regex_pattern FunctionTool Novel attacks only — extracts imperative openers, persona-shift phrases, and extraction requests from the input and builds a generalised regex fingerprint.

This agent is the only one that writes to threat_knowledge. After MCP insert, the new record is immediately searchable by the detection agent's MCP semantic search on the next scan — closing the self-learning loop end to end.


Remediation Agent — remediation_agent

Model: gemini-3.1-flash-lite
File: llm_guardian/sub_agents/remediation/
Output key: remediation_agent_output

Final synthesis agent. Always called — even on clean scans (for audit compliance). Four tools:

Tool What it does
map_to_owasp Maps each detected threat type to its OWASP LLM Top 10 2025 category via a normalised lookup table covering all 10 categories
generate_security_report Consolidates threats from all agents, computes probabilistic risk score (1 − ∏(1 − impact × confidence)), generates a structured incident report, writes to security_reports and threat_logs in MongoDB
recommend_mitigations Produces prioritized, actionable mitigation steps per OWASP category; HIGH priority for CRITICAL/HIGH severity threats
log_audit_entry Writes to audit_trail in MongoDB only when a BLOCK or WARN action was taken — avoids noise from clean scans

Audit Lifecycle — Orchestrator FunctionTools

Three FunctionTools are held directly by the orchestrator (not sub-agents):

Tool What it does
start_audit_log Opens a new audit record with a unique request_id; stores input text and timestamp in MongoDB audit_logs
record_agent_result Stamps each sub-agent's output into the active audit record by request_id; uses explicit ID passing to avoid ContextVar threading bugs
finalize_audit_log Seals the complete audit record with final recommendation, risk score, incident ID, and OWASP categories; persists to audit_logs

MongoDB Atlas as the AI-Ready Data Layer

Collections and how agents access them:

Collection Access method Agent
threat_knowledge MongoDB MCP aggregate (semantic search) detection_agent — reads on every scan
threat_knowledge MongoDB MCP aggregate (duplicate check) + insert-many knowledge_developer_agent — writes on novel threat discovery
detection_rules Direct pymongo (get_detection_rules) detection_agent — 8 OWASP rule-based detectors
pii_registry Direct pymongo response_agent — PII leakage scan
content_policy_rules Direct pymongo response_agent — unsafe content scan
audit_logs Direct pymongo orchestrator — one record per request
security_reports Direct pymongo remediation_agent — structured incident report
threat_logs Direct pymongo remediation_agent — risk ≥ 0.35 incidents
audit_trail Direct pymongo remediation_agent — BLOCK/WARN decisions only

The two highest-value runtime data flows — detecting known threats and learning new ones — both go through MongoDB MCP, making those the clearest example of the MongoDB + ADK MCP integration in production use.

How we built it

We built LLM Guardian on Google's Agent Development Kit (ADK) 2.1.0 and Gemini models, with MongoDB Atlas as the AI-ready data and knowledge layer.

  • Google ADK 2.1.0 gave us a clean, productive framework for composing a hierarchical multi-agent system: a root orchestrator coordinating specialized sub-agents for detection, analysis, response, and remediation. We used Google's own Cyber Guardian sample from Agent Garden as our structural starting point — it showed us the ADK composition patterns and let us focus immediately on the security domain rather than scaffolding. ADK 2.1.0 brought improved AgentTool stability and session state propagation that was essential for our deterministic 7-step pipeline.
  • Gemini models power the intelligence, and we tier them by job:
    • gemini-3.1-flash-lite for fast, high-volume processing across all five analysis sub-agents (detection, analysis, response, remediation, T2 analyst) — optimised for low latency and large context throughput, handling the bulk of every scan.
    • gemini-3.5-flash (latest) as the core orchestration brain — it coordinates the entire multi-agent pipeline, decides routing, synthesizes findings from all sub-agents, applies deterministic verdict logic, and produces the final structured security report. Using the most capable model here ensures no coordination errors across the 7-step pipeline.
  • MongoDB Atlas serves as the AI-ready database for long-term storage, audit logging, incident reporting, and the threat knowledge library. Detection rules, PII patterns, content policies, and the growing threat knowledge base are all live Atlas collections — agents read and write them as first-class operations, making continuous threat knowledge building possible end to end.
  • Voyage AI 4 generates 1024-dimensional embeddings for the threat knowledge library, enabling semantic duplicate detection (cosine similarity threshold > 0.90) and hallucination checking via Atlas Vector Search. Voyage AI 4 is the current state-of-the-art embedding model recommended by MongoDB for Atlas Vector Search — chosen specifically for its retrieval accuracy on short, domain-specific security phrases like attack indicators and threat descriptions.

The result is a comprehensive pipeline: Tier 1 semantic vector search against the self-building threat knowledge libraryTier 2 bad-actor intention detection with a knowledge development agentreview, logging, incident reporting, and continuous knowledge development — fully orchestrated by Gemini and MongoDB.

Challenges we ran into

  • First time working with MongoDB. This project was our first hands-on experience with MongoDB Atlas. Learning the Atlas data model, setting up Vector Search indexes, understanding the MCP tooling, and wiring pymongo directly into an async agent system — all at the same time as building the security logic — added real complexity to the timeline. Getting the vector search index configured correctly for Voyage AI 4 embeddings and understanding how Atlas Search complements vector similarity took iteration.
  • Reliable knowledge persistence. Getting agents to write durable, structured knowledge and audit logs to MongoDB — with correct request correlation across asynchronous, threaded execution — surfaced subtle ContextVar propagation bugs we had to track down and fix. The solution was explicit request_id passing rather than implicit context propagation.
  • Making the knowledge library truly self-building. Designing an agent that decides on its own what's worth learning, checks for duplicates via vector search before inserting, and persists it without human curation was as much a prompt-engineering challenge as an architectural one.

Accomplishments that we're proud of

  • A working, end-to-end agentic LLM security guardian covering the full OWASP Top 10 for LLM Applications.
  • A two-tier detection architecture that intelligently trades speed for depth — fast screening for everything, deep reasoning where it matters.
  • A genuinely self-motivated, continuously growing threat knowledge library, built and maintained autonomously by Gemini and MongoDB.
  • Treating AI governance as a complete lifecycle — protection, monitoring, and review — rather than a single bolt-on filter.
  • Full audit logging, incident reporting, and OWASP category mapping, making the system usable for real governance, not just demos.
  • A fast build enabled by ADK and Agent Garden — Cyber Guardian gave us the composition skeleton; we built a production-grade domain system on top in a hackathon window.

What we learned

  • Orchestration model quality matters more than you think. We tested both gemini-3.1-flash-lite and gemini-3.5-flash as the orchestrator. The quality difference was significant — in both pipeline coordination and the coherence of the final output. A capable model at the orchestration layer isn't a luxury; it's what holds a multi-agent system together.
  • Reference samples accelerate serious builds. Google's Cyber Guardian sample from Agent Garden didn't just teach us ADK — it gave us a production-ready structural pattern we could confidently build on and diverge from with purpose.
  • Tiering models by task pays off. Pairing a fast, lightweight model with a smarter, deeper one gives both performance and accuracy, instead of compromising on one.
  • MongoDB Atlas is robust, easy to set up, and RAG-ready out of the box. Despite being our first time with MongoDB, the Atlas free tier (M0) was up and running in minutes. The built-in Vector Search, Atlas Search full-text indexing, and the document model made it a natural fit for an AI security system — storing detection rules, embeddings, audit logs, and incident reports all in one place without a separate vector database. Atlas is genuinely RAG-friendly from day one.
  • MongoDB Atlas autoEmbed eliminates the embedding pre-computation step. We initially built the threat knowledge pipeline with a separate Voyage AI call to generate embeddings before every vector search and insert — a two-step flow that broke down inside ADK's batched tool execution model. Switching to Atlas autoEmbed (configured with Voyage AI 4 on the indicator field) solved this cleanly: agents send plain text to the MCP aggregate call, and Atlas handles the embedding internally. Inserts also work without pre-computed vectors — Atlas generates and indexes the embedding at write time. This is a significant simplification for any agentic system that needs semantic search: one MCP tool call does everything.
  • MongoDB MCP semantic search is genuinely useful in agentic pipelines. The MongoDB-provided MCP server integrates directly with the ADK tool system. Agents call aggregate with a $vectorSearch pipeline as a first-class tool — no custom HTTP wrappers, no embedding client, no boilerplate. The detection agent uses this on every scan to search the self-building threat knowledge library; the knowledge developer uses it for duplicate detection before every insert. Seeing an agent autonomously decide to search a vector database, interpret the results, and act on them — all via a single MCP tool call — confirmed that MongoDB MCP + ADK is a real production integration pattern, not just a demo.
  • Google ADK evolved dramatically from 1.3.x to 2.1.0. We experienced this firsthand — ADK 1.3.x had rough edges: the Dev UI was minimal and offered little insight into what was happening inside a multi-agent run. ADK 2.1.0 is a significant leap in observability and developer experience: the Dev UI gained agent traces, tool call inspection, and session state visibility that made debugging the pipeline far faster. If you started on an earlier ADK version and gave up, 2.1.0 is worth another look.
  • ADK's agent trace is an essential troubleshooting tool. When building a multi-agent pipeline, it's rarely obvious whether a tool was called, in what order, or what it returned — especially when the orchestrator is routing across six sub-agents. The ADK Dev UI trace shows every tool invocation, its inputs, its output, and which LLM turn triggered it. This was critical for us when debugging why the detection agent wasn't calling the MongoDB MCP semantic search tool: the trace revealed that the embedding step and the MCP aggregate call were landing in different LLM batches, with the first batch concluding "no risk" before the second batch's results were available. Without the trace, that failure mode would have been nearly impossible to diagnose from logs alone.
  • Knowledge is the moat. A guardian that continuously builds its own threat knowledge gets better the longer it runs — turning security from a static ruleset into a living, improving asset.

What's next for LLM Guardian Agent

  • Broader real-time integrations — drop-in middleware and gateway support so any LLM application can be protected with minimal effort.
  • Richer governance dashboards — surfacing trends, incident analytics, and compliance reporting on top of the audit log.
  • Smarter autonomous knowledge curation — letting the knowledge development agent rank, deduplicate, and validate learned threats, and share intelligence across deployments.
  • Expanded coverage — adapting detection as the OWASP LLM Top 10 and the broader threat landscape evolve.
  • Community and ecosystem — opening the threat knowledge framework so the wider AI community can contribute to and benefit from a shared, ever-growing defense.

Built With

  • gemini-3.1-flash-lite
  • gemini-3.5-flash
  • googleadk
  • mongodb-altas
  • voyage4
Share this project:

Updates