The Challenge
Cyber attackers have a fundamental advantage: they only need to succeed once, while defenders must catch every intrusion, every time. Traditional signature-based tools fail against novel threats, and single-layer anomaly detectors are easy to evade: a patient attacker who mimics normal behavior slips past them 62% of the time.
We asked: what if a network defense system worked more like a biological immune system?
The Idea
Your immune system doesn't rely on a single mechanism. It has innate immunity for fast, obvious threats. It has adaptive memory that learns from past exposure. It has deception — luring pathogens into traps. It coordinates all of these layers simultaneously, and it gets smarter over time.
SOMA (Signaling-Optimal Memory Architecture) is that idea applied to cybersecurity. It's a multi-layer autonomous defense system that learns what a healthy network looks like, detects deviations at multiple timescales, and actively lures attackers into honeypot containers, all without a signature database or a human in the loop.
Why This Matters
Low-and-slow attackers are the most dangerous. They spread activity over weeks, stay below alert thresholds, and mimic legitimate traffic. SOMA's Long-Dwell layer tracks behavioral drift over rolling windows of 100 episodes, so it can detect attacks that have been building silently for days and that single-step detectors miss entirely. The system detected 100% of sophisticated evasion attacks while maintaining a 0% false positive rate.
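To make the rolling-window idea concrete, here is a minimal sketch of how drift over 100 episodes could be flagged even when no single episode looks alarming. This is an illustration, not SOMA's actual Long-Dwell code: the class name `LongDwellDetector`, the z-score test, the baseline values, and the threshold of 3.0 are all assumptions made for the example.

```python
from collections import deque
from statistics import mean


class LongDwellDetector:
    """Hypothetical drift tracker: compares the mean of a rolling window
    of per-episode scores against a known healthy baseline."""

    def __init__(self, baseline_mean, baseline_std, window=100, z_threshold=3.0):
        if baseline_std <= 0:
            raise ValueError("baseline_std must be positive")
        self.baseline_mean = baseline_mean
        self.baseline_std = baseline_std
        self.z_threshold = z_threshold          # assumed alert threshold
        self.scores = deque(maxlen=window)      # keeps only the last `window` episodes

    def single_step_alert(self, value):
        """What a single-step detector would do: judge one episode alone."""
        return abs(value - self.baseline_mean) / self.baseline_std > self.z_threshold

    def observe(self, value):
        """Record one episode and return True if the window mean has drifted."""
        self.scores.append(value)
        if len(self.scores) < self.scores.maxlen:
            return False                        # not enough history yet
        # The mean of n samples varies by std / sqrt(n), so a small but
        # persistent shift becomes statistically visible over a long window.
        standard_error = self.baseline_std / len(self.scores) ** 0.5
        z = (mean(self.scores) - self.baseline_mean) / standard_error
        return abs(z) > self.z_threshold


detector = LongDwellDetector(baseline_mean=10.0, baseline_std=2.0)
healthy = [detector.observe(10.0) for _ in range(100)]    # normal traffic
drifting = [detector.observe(11.0) for _ in range(100)]   # only 0.5 std above normal
```

In this toy run, each drifting episode sits half a standard deviation above the baseline, so `single_step_alert(11.0)` stays quiet, while the windowed check fires once enough elevated episodes accumulate.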
The Demo
We built a live end-to-end attack scenario: a real email triggers a malware download. The payload spawns CPU-burning worker processes. SOMA reads real system telemetry (via psutil), crosses its anomaly threshold, and — instead of simply blocking — spins up a Docker honeypot container and silently redirects the malware. The attacker thinks they're still connected to the victim machine. The dashboard turns red, then green, then shows a live honeypot feed. A single "Purge & Destroy" button kills the container and clears the scene.
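The telemetry-to-trigger step of the demo can be sketched roughly as follows. This is a simplified stand-in, not the demo's code: it assumes CPU utilization is the only signal, and the function names, the 85% threshold, and the three-sample streak are hypothetical. The only real library call used is `psutil.cpu_percent`; starting the honeypot container and redirecting traffic are left out.

```python
from typing import Iterable


def read_cpu_percent() -> float:
    """Read system-wide CPU utilization over a one-second interval."""
    import psutil  # third-party; imported here so the rest runs without it
    return psutil.cpu_percent(interval=1.0)


def should_redirect(samples: Iterable[float],
                    threshold: float = 85.0,
                    sustained: int = 3) -> bool:
    """Return True once `sustained` consecutive samples exceed `threshold`.

    Requiring a streak (an assumption of this sketch) avoids reacting to a
    single brief spike from a legitimate process.
    """
    streak = 0
    for sample in samples:
        streak = streak + 1 if sample > threshold else 0
        if streak >= sustained:
            return True
    return False


# Example with recorded samples: a brief spike, then a sustained burn.
spike_only = should_redirect([90.0, 20.0, 95.0, 30.0, 99.0])
sustained_burn = should_redirect([20.0, 90.0, 95.0, 99.0])
```

In a live loop, the samples would come from repeated calls to `read_cpu_percent()`, and a `True` result would be the point at which the honeypot container is started.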
The Research Grounding
The deception layer is grounded in signaling game theory. We implemented a Perfect Bayesian Equilibrium (PBE) solver and trained a PPO reinforcement learning agent to recover the theoretically optimal mixing strategy — validated against closed-form benchmarks across three attention-cost regimes (κ ∈ {0, V/2, V}). The RL policy converges to the predicted equilibrium, not just a heuristic approximation.