Inspiration

Most SOAR platforms cost six figures, take months to deploy, and still require a dedicated team to maintain. I wanted to build something a single analyst could stand up in minutes and immediately use to investigate real alerts. The Splunk MCP Server made this possible: a clean, standardized interface between AI agents and Splunk data without any direct SDK calls or hardcoded queries.

What it does

TriagaSOAR is a full agentic SOAR platform that runs entirely in Docker. Submit an alert, and three AI agents go to work.

A router agent (Qwen3 1.7B) classifies the alert and selects an initial SPL query. A primary agent (Qwen3 14B) runs a multi-step pivot loop. It executes SPL via the Splunk MCP Server, reads the results, scores each finding, and derives the next query from what it discovered. Each pivot, from IP to user to process to timeline, is a live decision. An adversarial agent (Qwen3 14B) then challenges the conclusions. If it finds weaknesses in the reasoning, the primary agent reinvestigates with the critique as context.
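To make the pivot loop concrete, here is a minimal sketch of its control flow. It is not the project's code: `pivot_loop`, `Finding`, and the three callables (`run_spl`, `score`, `next_query`) are hypothetical names standing in for the MCP call, the scoring step, and the LLM's choice of the next query, and the step budget of 4 is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    query: str    # the SPL that was executed
    rows: list    # raw results returned for that query
    score: float  # how relevant the agent judged the results


def pivot_loop(
    initial_query: str,
    run_spl: Callable[[str], list],                       # hypothetical: wraps the MCP call
    score: Callable[[list], float],                       # hypothetical: scores one result set
    next_query: Callable[[list], Optional[str]],          # hypothetical: LLM picks the next pivot
    max_steps: int = 4,                                   # assumed step budget
) -> list:
    """Run queries until no follow-up is proposed or the step budget is spent."""
    findings = []
    query: Optional[str] = initial_query
    for _ in range(max_steps):
        if query is None:
            break
        rows = run_spl(query)
        findings.append(Finding(query, rows, score(rows)))
        # The next query is derived from everything found so far.
        query = next_query(findings)
    return findings
```

Passing the full list of findings to `next_query` is what lets each pivot depend on earlier results, and the step budget keeps a confused model from looping forever.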

The final report includes MITRE ATT&CK technique mapping, blast radius estimation, AbuseIPDB threat intelligence enrichment, a confidence score, kill chain summary, and remediation recommendations. It exports as a PDF or as a MITRE ATT&CK Navigator JSON layer.

Beyond investigation, a YAML playbook engine fires automatically after every investigation. Cross-platform identity correlation searches Entra ID, Okta, and Auth0 simultaneously. Velociraptor endpoint hunting dispatches artifact collections automatically when suspicious hosts are identified. Every destructive response action requires a Single-Action Token with a written reason and a 60-second TTL, and leaves a hash-chained audit trail that the application itself cannot modify.
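The Single-Action Token idea can be sketched in a few lines. This is an illustration under assumptions, not the Rust implementation: the class name, the in-memory store, and the method names are hypothetical. Only the written reason and the 60-second TTL come from the description above.

```python
import secrets
import time

TTL_SECONDS = 60  # from the writeup: tokens live for 60 seconds


class SingleActionTokens:
    """Hypothetical in-memory store; a real deployment would persist and audit this."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = {}  # token -> (action, reason, expires_at)

    def issue(self, action: str, reason: str) -> str:
        if not reason.strip():
            raise ValueError("a written reason is required")
        token = secrets.token_urlsafe(32)
        self._pending[token] = (action, reason, self._clock() + TTL_SECONDS)
        return token

    def redeem(self, token: str, action: str) -> bool:
        # pop() removes the token whether or not the check passes,
        # so it can never be presented a second time.
        entry = self._pending.pop(token, None)
        if entry is None:
            return False
        bound_action, _reason, expires_at = entry
        return bound_action == action and self._clock() <= expires_at
```

Binding the token to one named action means a token issued to isolate a host cannot be replayed to disable a user.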

How I built it

A Rust/Axum auth proxy sits in front of everything. Sessions are Argon2id-hashed, IP and user-agent bound, with a hash-chained audit log in a separate Postgres schema using an INSERT-only database role. Admin credentials live in Docker secrets, never in environment variables.
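For readers unfamiliar with hash chaining, the sketch below shows the general technique in Python. It is not the proxy's Rust code: the row layout, the SHA-256 choice, and the all-zero genesis value are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first row


def entry_hash(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log: list, event: dict) -> None:
    """Add a row whose hash covers both the event and the previous row's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited or removed row breaks it."""
    prev = GENESIS
    for row in log:
        if row["prev"] != prev or row["hash"] != entry_hash(prev, row["event"]):
            return False
        prev = row["hash"]
    return True
```

The chain makes tampering detectable, while the INSERT-only role is what stops the application from rewriting rows in the first place. The two controls cover different failure modes.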

The investigation agent is Python/FastAPI with three Qwen3 models on Ollama. The Splunk MCP Server is the only interface to Splunk data. Every SPL query is generated by the LLM and validated before execution. User input is isolated in a separate message to prevent prompt injection, and generated SPL is checked against a dangerous command blocklist before it runs.
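A blocklist check of this kind might look like the sketch below. The command set and the function name are illustrative assumptions, not the project's actual list. The commands named are real SPL commands that write, delete, or send data.

```python
import re

# Illustrative set of SPL commands with side effects; the project's list may differ.
BLOCKED_COMMANDS = {"delete", "outputlookup", "outputcsv", "collect", "sendemail", "script"}

# A command name is the first word of the query, or the first word after a
# pipe or after the opening bracket of a subsearch.
_STAGE_COMMAND = re.compile(r"(?:^|\||\[)\s*([A-Za-z_]+)")


def blocked_commands(spl: str) -> list:
    """Return the blocked commands found in a query, sorted; empty means none found."""
    found = {m.group(1).lower() for m in _STAGE_COMMAND.finditer(spl)}
    return sorted(found & BLOCKED_COMMANDS)
```

This sketch does not parse quoted strings, so a pipe inside a field value can trigger a false positive. For LLM-generated queries, rejecting a safe query is the cheaper mistake.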

I also containerized Splunk Enterprise with the MCP Server pre-baked, synthetic attack data pre-seeded, and full zero-touch setup via make up-splunk. No Splunk license or external dependencies required to run a real investigation.

Challenges

Splunk's MCP Server uses encrypted tokens that only work through the MCP endpoint, not the standard REST API. The gRPC TLS hostname verification in Velociraptor blocked direct connections from soc-agent, so I routed through docker exec instead. The Postgres init scripts failed silently on clean volume starts because NOW() in a partial index predicate is not immutable. I found that at midnight before the deadline, during a stress test.

Getting ScubaGear running on Linux required patching out [Security.Principal.WindowsPrincipal] calls at build time with a Python script, symlinking the OPA binary, and working around a .NET X509Store limitation for certificate auth. ScubaGear has never had official Linux or Docker support; it requires Windows by design. The working Dockerfile and patch script are in the repo, and I filed the proof of concept upstream to the CISA ScubaGear project: #2211.

Accomplishments

A fully working agentic SOAR platform built in a hackathon window. The adversarial agent debate architecture is something I haven't seen in other SOC tools. The hash-chained audit log with an INSERT-only Postgres role means the system genuinely cannot tamper with its own evidence. Anyone can run a real investigation in under ten minutes with no prior setup.

What I learned

Building production-grade auth from scratch in Rust. The MCP protocol surface area. How Velociraptor's gRPC API works under the hood. That Postgres partial indexes cannot use non-immutable functions.

What's next

Wazuh integration for host-based detection. A threat hunting interface built on the pattern library. Multi-tenancy for team deployments. Full investigation pipelines defined as YAML.

Built With

Python, FastAPI, Rust, Axum, Astro, React, TypeScript, PostgreSQL, SQLite, Ollama, Qwen3, Splunk Enterprise, Splunk MCP Server, Velociraptor, Docker, Microsoft Graph API, Okta API, Auth0 API, AbuseIPDB, Argon2id, WeasyPrint
