🛡️ AgentArmor — Immune System for AI Agents

Tagline: Attack one, defend all — bio-inspired collective immunity for AI agents.

💡 Inspiration

The explosion of autonomous AI agents — browsing the web, managing emails, reviewing code, analyzing data — has created a massive blind spot in cybersecurity. Traditional security tools (firewalls, WAFs, antivirus) were designed for human-computer interaction. But what happens when AI is the target?

In early 2025, researchers demonstrated that a single prompt injection hidden in a web page could hijack an AI agent into leaking private emails, executing malicious code, or exfiltrating sensitive data. There is no standardized security framework for protecting AI agents.

That's when it clicked: the human body already solved this problem. Our immune system doesn't rely on a single wall — it uses layered defense:

Skin (barrier defense) → blocks most pathogens on contact
Innate immunity (behavioral response) → detects anomalous behavior patterns
Adaptive immunity (memory + antibodies) → learns from attacks and propagates defense to the entire body

What if we could give AI agents the same biological defense? That's AgentArmor.

🔧 What It Does

AgentArmor is a three-layer biological immune system for AI agents:

Layer 1: Injection Firewall (The Skin)

Every input is scanned by 5 parallel detection strategies before it reaches any agent:

Strategy	What It Catches
Pattern Matching	30+ known injection phrases ("ignore previous", "you are now DAN")
Encoding Detection	Base64, Unicode tricks, HTML entities, hex-encoded payloads
Structural Analysis	Imperative commands, role reassignment, context splitting
Entropy Analysis	Obfuscated/encrypted payloads via Shannon entropy
Zero-Width Detection	Steganographic attacks using invisible Unicode characters

The risk score is computed as a weighted combination:

$$R = \min\left(100,\; \sum_{i=1}^{n} w_i \cdot c_i \cdot 100\right)$$

where $w_i$ is the strategy weight and $c_i$ is the confidence score from each matched strategy. Detection latency: < 2ms.

Layer 2: Behavioral Immune System (Innate Immunity)

Each agent builds a 6-dimensional behavioral fingerprint:

$$\vec{B} = \left( f_{api},\; t_{resp},\; d_{action},\; r_{error},\; a_{resource},\; v_{data} \right)$$

The anomaly detector computes weighted deviation scores:

$$A = \frac{\sum_{j=1}^{6} w_j \cdot \delta_j}{\sum_{j=1}^{6} w_j}, \quad \delta_j = \frac{|x_j - \mu_j|}{\sigma_j + \epsilon}$$

When $A > 0.3$: SUSPICIOUS. When $A > 0.6$: QUARANTINED. Agents self-heal when behavior normalizes.

Layer 3: Collective Immunity (Adaptive Immunity)

A Honeypot Agent mimics vulnerability to attract attackers, captures their techniques, and generates defense signatures
Immune Memory stores all attack signatures and propagates them to every agent in the network
Result: Attack one agent → all agents become immune

🏗️ How I Built It

Architecture

Backend: Python FastAPI with 5 security engines, WebSocket real-time streaming, and a 6-agent simulation engine
Frontend: React 18 + Vite 5 with a custom cybersecurity dark theme (1,600+ lines of CSS), Canvas API visualizations, and animated components
Zero external APIs: All detection runs locally using heuristic engines and statistical analysis — no API keys, no cost, no rate limits

Key Technical Decisions

Heuristic-first detection — Instead of relying on expensive LLM-based classification, I built a multi-strategy pattern engine that runs in < 2ms. This means the security layer adds virtually zero latency to agent processing.
Biological metaphor as architecture — The three-layer model isn't just branding. Each layer operates independently with clear inputs/outputs, making the system modular and extensible. New detection strategies plug into Layer 1. New behavioral dimensions plug into Layer 2. New propagation methods plug into Layer 3.
Simulation-driven demo — Rather than connecting to real AI services (which would require API keys and add failure points), I built a full simulation engine that generates realistic agent telemetry and periodic attacks. This makes the demo self-contained and reproducible.
WebSocket for real-time — The threat feed, agent status updates, and immune propagation events all stream via WebSocket, making the dashboard feel alive and responsive.

Tech Stack

Component	Technology
Backend	Python 3.11, FastAPI, Uvicorn, Pydantic
Frontend	React 18, Vite 5, Canvas API
Real-time	WebSocket (native)
Deployment	Vercel (frontend) + Render (backend)
Design	Custom CSS with glassmorphism, micro-animations

🧗 Challenges I Faced

1. False Positive Calibration

The hardest part wasn't catching attacks — it was not catching legitimate inputs. My first version of the pattern matcher flagged "Can you help me ignore this error?" as an injection because it contained the word "ignore." I had to redesign the pattern engine to use phrase-level matching with context windows instead of single-keyword detection. The result: zero false positives on clean inputs.

2. Behavioral Baseline Cold Start

When an agent first connects, there's no behavioral baseline. I solved this by generating synthetic baselines with randomized-but-realistic initial values, then letting the behavioral engine converge to the real baseline through exponential moving averages.

3. Render Deployment + Python 3.14

Render defaulted to Python 3.14, which had no pre-built wheels for pydantic-core. The build failed trying to compile Rust from source on a read-only filesystem. I fixed this by pinning Python 3.11 via .python-version and relaxing dependency pins from == to >=.

4. Real-Time UI Without External Libraries

I wanted the dashboard to feel premium without heavy chart libraries. I built all visualizations — radar chart, risk gauge, network graph, security pipeline — using the Canvas API directly. This kept the bundle tiny but required significant manual coordinate math.

🏆 Accomplishments I'm Proud Of

Sub-2ms detection latency — The injection firewall is faster than a human blink
Zero false positives — Clean inputs always pass through cleanly
Fully interactive demo — Judges can type ANY attack and see it detected live
Biological architecture — Not just a metaphor, but a genuinely modular three-layer system
Solo build — Architecture, 5 backend engines, 14 React components, 1,600+ lines of CSS, deployment, all built by one person
Zero external API dependencies — No keys, no cost, works offline

📚 What I Learned

Biology is an incredible teacher for systems design. The human immune system has been refined over millions of years of evolution. Its layered defense, behavioral monitoring, and collective memory patterns map beautifully to digital security challenges.
Heuristics can be surprisingly powerful. I initially assumed I'd need an LLM for injection detection. But a well-designed multi-strategy heuristic engine with 30+ patterns, encoding analysis, and entropy scoring achieves > 90% detection rate at 1000x lower latency and zero cost.
The demo IS the product at a hackathon. I spent significant time on micro-animations, the guided tour, and expandable cards — features that don't affect core functionality but massively improve the judge experience. Every second of a demo counts.
Deployment is its own challenge. What works on localhost:8000 doesn't always work on Render. Python version pinning, CORS configuration, WebSocket proxying — these "boring" problems consumed real time.

🔮 What's Next

Azure OpenAI Integration — Add semantic-level injection detection using LLM embeddings for even higher accuracy
Real Agent Connectors — Plugin SDK for LangChain, AutoGen, CrewAI, and Semantic Kernel agents
Production Immune Network — Distributed signature propagation across organizations (like virus definition updates)
Attack Signature Marketplace — Community-shared defense signatures, similar to antivirus databases
Compliance Dashboard — OWASP Top 10 for LLMs alignment scoring and audit trails