What it does

NEX is an autonomous purple team for Splunk. It runs one closed loop against your data: recon → attack-think → prove → skeptic → ship. It enumerates your real detections, uses a security-tuned LLM to pick the highest-impact ATT&CK technique that is present in your telemetry but uncovered, proves the gap against live telemetry (e.g. 301 S3-exfil events, 0 detections), sanity-checks itself to kill false positives, then *writes and deploys an SPL + Sigma detection* as a real Splunk saved search. The loop runs its SPL against live Splunk (REST/MCP); in the sandbox, the data plane is emulated. Coverage flips 0→1 and the blind spot closes.
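The five stages can be sketched as one pass through a small driver function. This is a minimal illustration, not NEX's actual code: the `Gap` dataclass, the `run_once` function, the stage signatures, and the technique IDs in the usage lines are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Gap:
    technique: str    # ATT&CK technique ID chosen by the model
    events: int       # telemetry events showing the technique is present
    detections: int   # existing detections that already cover it


def run_once(
    recon: Callable[[], List[str]],
    attack_think: Callable[[List[str]], str],
    prove: Callable[[str, List[str]], Gap],
    skeptic: Callable[[Gap], bool],
    ship: Callable[[Gap], str],
) -> Optional[str]:
    """Run one pass; return the deployed saved-search name, or None."""
    covered = recon()                    # enumerate existing detections
    technique = attack_think(covered)    # pick a candidate technique
    gap = prove(technique, covered)      # measure telemetry vs. coverage
    if gap.events == 0 or gap.detections > 0:
        return None                      # no telemetry, or already covered
    if not skeptic(gap):
        return None                      # treated as a likely false positive
    return ship(gap)                     # deploy the detection


# Stub stages standing in for the real tools (placeholder values only).
result = run_once(
    recon=lambda: ["T1078"],
    attack_think=lambda covered: "T1537",
    prove=lambda t, covered: Gap(t, events=301, detections=int(t in covered)),
    skeptic=lambda gap: True,
    ship=lambda gap: f"nex_{gap.technique}",
)
```

Each stage is passed in as a plain callable, so any of them can be replaced with a stub for testing without touching the loop itself.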

Inspiration

Every SOC runs on detections, and the dangerous part is what they don't watch for: if no rule covers a technique, nothing alerts and the attack walks. Those gaps are invisible by definition. We wanted an agent that thinks like a bug-bounty hunter (recon → prove → report) but goes one step further: it ships the fix.

How we built it

A React + Vite + React Flow dashboard streams a Python/FastAPI agent loop over SSE. The brain is Foundation-Sec-8B (Cisco's security-tuned LLM) running locally via Ollama. The agent is decoupled from Splunk behind six tools, so the same loop runs against the Splunk MCP Server, the REST API, or a bundled mock: swap one config value. Detections are emitted as SPL + Sigma, mapped to MITRE ATT&CK.
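One way to get that kind of decoupling is a small tool interface with a backend chosen by a single config key. The sketch below is an assumption about the shape, not the project's real code: `SplunkTools`, its two methods (standing in for the six tools, which are not named here), the `backend` config key, and the canned mock values are all hypothetical. The REST and MCP backends are left as unimplemented placeholders.

```python
from typing import Callable, Dict, List, Protocol


class SplunkTools(Protocol):
    """Hypothetical slice of the tool surface the agent depends on."""

    def list_detections(self) -> List[str]: ...

    def run_search(self, spl: str) -> List[dict]: ...


class MockBackend:
    """In-memory stand-in that returns canned results."""

    def list_detections(self) -> List[str]:
        return ["Excessive failed logins"]

    def run_search(self, spl: str) -> List[dict]:
        return [{"count": 301}]


class UnwiredBackend:
    """Placeholder for the REST and MCP backends, not implemented here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def list_detections(self) -> List[str]:
        raise NotImplementedError(f"{self.name} backend is not wired up")

    def run_search(self, spl: str) -> List[dict]:
        raise NotImplementedError(f"{self.name} backend is not wired up")


# Registry mapping the config value to a backend factory.
BACKENDS: Dict[str, Callable[[], SplunkTools]] = {
    "mock": MockBackend,
    "rest": lambda: UnwiredBackend("rest"),
    "mcp": lambda: UnwiredBackend("mcp"),
}


def make_backend(config: Dict[str, str]) -> SplunkTools:
    """Pick the backend named by the (hypothetical) 'backend' config key."""
    name = config.get("backend", "mock")
    if name not in BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    return BACKENDS[name]()


tools = make_backend({"backend": "mock"})
rows = tools.run_search("index=aws | stats count")
```

Because the agent loop only sees the `SplunkTools` shape, switching from the mock to a live backend would not require changes to the loop.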

Challenges we ran into

Small-model reliability. We added grounding guards (impact-ranked selection, a citation-checked skeptic, an on-target SPL validator) plus a deterministic safety net, so the agent stays genuinely model-driven while hallucinated output is caught before it reaches a consequential step. We also separated proof from fix: the gap is proven by the technique's real telemetry presence, not by the model's candidate SPL.
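Two of these deterministic gates can be illustrated in a few lines. This is a sketch under assumptions: the function names, the idea of evidence IDs, and the exact rules are hypothetical and only show the general pattern of checking model output against ground truth.

```python
from typing import Iterable, Set


def gap_is_proven(event_count: int, covering_detections: int) -> bool:
    """Proof depends only on measured telemetry and existing coverage.

    The model's candidate SPL plays no part in this decision.
    """
    return event_count > 0 and covering_detections == 0


def citations_hold(cited_ids: Iterable[str], evidence_ids: Set[str]) -> bool:
    """Accept the skeptic's verdict only if every cited ID really exists.

    An empty citation list is rejected, since it grounds nothing.
    """
    cited = list(cited_ids)
    return bool(cited) and all(c in evidence_ids for c in cited)


# Placeholder values: 301 events, no covering detection, two evidence rows.
proven = gap_is_proven(event_count=301, covering_detections=0)
grounded = citations_hold(["evt-1"], {"evt-1", "evt-2"})
```

Both checks are pure functions of retrieved data, so a wrong answer from the model can at worst cause a pass to be rejected.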

What we learned

LLM security agents earn trust by being grounded — the model reasons, but deterministic checks gate every consequential step against ground truth.

What's next

Live MCP once KV Store is healthy; an SPL command allowlist so the agent can't run a destructive search; multi-gap sweeps and continuous scheduling.
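A command allowlist of that kind could look roughly like the sketch below. It is an assumption about one possible design, not a planned implementation: the allowed set, the function names, and the choice to refuse subsearches outright are all hypothetical, and the splitter is a simple scanner rather than a full SPL parser.

```python
from typing import List, Set

# Hypothetical allowlist of read-only SPL commands.
ALLOWED_COMMANDS: Set[str] = {
    "search", "tstats", "stats", "where", "eval",
    "table", "fields", "head", "sort",
}


def split_pipeline(spl: str) -> List[str]:
    """Split on pipes outside double quotes; refuse subsearch brackets."""
    stages: List[str] = []
    current: List[str] = []
    in_quotes = False
    escaped = False
    for ch in spl:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch in "[]" and not in_quotes:
            raise ValueError("subsearches are not supported in this sketch")
        elif ch == "|" and not in_quotes:
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    stages.append("".join(current).strip())
    return [s for s in stages if s]


def commands(spl: str) -> List[str]:
    """Return the command name of each pipeline stage, lowercased."""
    stages = split_pipeline(spl)
    implicit_search = not spl.lstrip().startswith("|")
    names = []
    for i, stage in enumerate(stages):
        if i == 0 and implicit_search:
            names.append("search")  # a leading stage without a pipe is a search
        else:
            names.append(stage.split()[0].lower())
    return names


def is_allowed(spl: str) -> bool:
    """True only if every command in the pipeline is on the allowlist."""
    try:
        names = commands(spl)
    except ValueError:
        return False
    return bool(names) and all(n in ALLOWED_COMMANDS for n in names)


safe = is_allowed("index=aws eventName=GetObject | stats count by user")
blocked = is_allowed("index=main | delete")
```

An allowlist fails closed: any command not explicitly listed, including ones added to SPL later, is rejected by default.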
