The Challenge

Cyber attackers have a fundamental advantage: they only need to succeed once. Defenders must catch every intrusion, every time. Traditional signature-based tools fail against novel threats, and single-layer anomaly detectors are easy to evade: against a patient attacker who mimics normal behavior, they miss 62% of attacks.

We asked: what if a network defense system worked more like a biological immune system?

The Idea

Your immune system doesn't rely on a single mechanism. It has innate immunity for fast, obvious threats. It has adaptive memory that learns from past exposure. It has deception — luring pathogens into traps. It coordinates all of these layers simultaneously, and it gets smarter over time.

SOMA (Signaling-Optimal Memory Architecture) is that idea applied to cybersecurity. It's a multi-layer autonomous defense system that learns what a healthy network looks like, detects deviations at multiple timescales, and actively lures attackers into honeypot containers, all without a signature database or a human in the loop.

Why This Matters

Low-and-slow attackers are the most dangerous. They spread activity over weeks, stay below alert thresholds, and mimic legitimate traffic. SOMA's Long-Dwell layer tracks behavioral drift over 100-episode rolling windows, so it can detect attacks that have been building silently for days and that single-step detectors miss entirely. The system detected 100% of sophisticated evasion attacks while maintaining a 0% false positive rate.
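To make the rolling-window idea concrete, here is a minimal sketch of a drift detector. It is not SOMA's actual Long-Dwell code: the class name `LongDwellDetector`, the fixed baseline, and the 15% drift threshold are all assumptions chosen for illustration. Only the 100-episode window comes from the description above.

```python
from collections import deque
from statistics import mean


class LongDwellDetector:
    """Toy drift detector (hypothetical, for illustration only).

    Flags an alert when the mean of the last `window` episodes drifts
    more than `drift_threshold` (relative) away from a known baseline.
    """

    def __init__(self, baseline, window=100, drift_threshold=0.15):
        self.baseline = baseline
        self.episodes = deque(maxlen=window)  # oldest episodes fall off automatically
        self.drift_threshold = drift_threshold

    def observe(self, value):
        """Record one episode; return True if rolling drift exceeds the threshold."""
        self.episodes.append(value)
        if len(self.episodes) < self.episodes.maxlen:
            return False  # not enough history yet
        drift = abs(mean(self.episodes) - self.baseline) / self.baseline
        return drift > self.drift_threshold


# A low-and-slow attacker: activity creeps up by 0.02 per episode and
# never crosses an (assumed) single-step alert threshold of 15.0.
SINGLE_STEP_THRESHOLD = 15.0
traffic = [10.0 + 0.02 * i for i in range(200)]

detector = LongDwellDetector(baseline=10.0)
first_alert = None
for episode, value in enumerate(traffic):
    if detector.observe(value) and first_alert is None:
        first_alert = episode
```

In this synthetic trace no single episode exceeds the per-step threshold, yet the rolling mean eventually drifts far enough from the baseline to raise an alert. A real system would learn the baseline rather than hard-code it.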

The Demo

We built a live end-to-end attack scenario: a real email triggers a malware download. The payload spawns CPU-burning worker processes. SOMA reads real system telemetry (via psutil), crosses its anomaly threshold, and — instead of simply blocking — spins up a Docker honeypot container and silently redirects the malware. The attacker thinks they're still connected to the victim machine. The dashboard turns red, then green, then shows a live honeypot feed. A single "Purge & Destroy" button kills the container and clears the scene.
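The detect-then-deceive step in the demo can be sketched roughly as follows. This is an illustrative outline, not the project's code: the function `monitor`, the 80% CPU threshold, the three-sample streak, and the `deploy_honeypot` callback are hypothetical. In a live setup the samples could come from `psutil.cpu_percent(interval=1)`, and the callback would start the Docker honeypot; here both are stubbed so the logic can run on its own.

```python
def monitor(samples, threshold=80.0, consecutive=3, deploy_honeypot=None):
    """Scan CPU readings and trigger deception on sustained anomalies.

    Returns the index of the sample at which the honeypot was deployed,
    or None if the anomaly condition was never met.
    All parameter values are assumptions for this sketch.
    """
    streak = 0
    for index, cpu in enumerate(samples):
        # Require several readings in a row so one brief spike is ignored.
        streak = streak + 1 if cpu > threshold else 0
        if streak >= consecutive:
            if deploy_honeypot is not None:
                deploy_honeypot()  # redirect instead of simply blocking
            return index
    return None


# Stubbed telemetry: one brief spike, then sustained CPU burn from the payload.
readings = [12.0, 15.0, 95.0, 20.0, 91.0, 96.0, 99.0, 97.0]

deployments = []
triggered_at = monitor(
    readings,
    deploy_honeypot=lambda: deployments.append("honeypot-started"),
)
```

Requiring a streak rather than a single reading is one simple way to keep a brief, legitimate CPU spike from triggering a redirect.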

The Research Grounding

The deception layer is grounded in signaling game theory. We implemented a Perfect Bayesian Equilibrium (PBE) solver and trained a PPO reinforcement learning agent to recover the theoretically optimal mixing strategy — validated against closed-form benchmarks across three attention-cost regimes (κ ∈ {0, V/2, V}). The RL policy converges to the predicted equilibrium, not just a heuristic approximation.
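As a rough illustration of what a closed-form mixing benchmark can look like, consider a deliberately simplified toy game. It is not the project's signaling model or its PBE solver. The assumptions are ours: the attacker pays κ to engage, gains V only if the target is real, and gains nothing from a honeypot. The defender's honeypot probability p that leaves the attacker indifferent between engaging and abstaining then solves (1 − p)·V − κ = 0.

```python
def attacker_payoff(p_honeypot, V, kappa):
    """Expected attacker payoff from engaging in the toy game (assumed model)."""
    return (1.0 - p_honeypot) * V - kappa


def indifference_probability(V, kappa):
    """Honeypot probability that makes the toy attacker indifferent.

    Solves (1 - p) * V - kappa = 0 for p, giving p = 1 - kappa / V.
    """
    if V <= 0:
        raise ValueError("V must be positive")
    if not 0 <= kappa <= V:
        raise ValueError("kappa must lie in [0, V]")
    return 1.0 - kappa / V


# The three attention-cost regimes named in the text, with an arbitrary V.
V = 10.0
regimes = {"kappa=0": 0.0, "kappa=V/2": V / 2, "kappa=V": V}
mixing = {name: indifference_probability(V, k) for name, k in regimes.items()}
```

A learned policy can be checked against a benchmark of this kind by comparing its empirical honeypot frequency in each regime with the closed-form value.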
