Sentinel Zero

Inspiration

Modern attackers use AI-driven malware that can move from initial access to full domain takeover in under 8 minutes — completely automated. Meanwhile, enterprise security teams are drowning in 500–5,000 alerts per day and can realistically review only 50–100. Human defenders are fighting at human speed against machine-speed threats.

This project is our unified submission to both the Splunk App Development Hackathon (autonomous SIEM triage) and the Finding Evil: Cybersecurity Hackathon (autonomous SIFT forensic investigation) — two different threat-response use cases powered by the exact same architecture.

What it does

Sentinel Zero is an autonomous AI incident-response agent powered by Google Gemini 2.5 Flash and the Model Context Protocol (MCP). It operates in two modes:

🔴 Splunk Mode (Splunk Hackathon Track)

Ingests live Splunk SIEM alerts and ranks them by severity (CRITICAL / HIGH / MEDIUM)
Autonomously triages the selected alert through a 5-iteration self-correcting agent loop
Real alert tested: "Unusual Volume Shadow Copy Deletion (vssadmin.exe)" — CRITICAL
Streams every reasoning step live to the analyst dashboard via Server-Sent Events

🟢 SIFT Forensics Mode (Finding Evil Hackathon Track)

Connects to a custom-built FastMCP server with read-only SIFT forensic tools
Loads real forensic targets: SEC-PROD-SRV01_disk.raw (45 GB) + SEC-PROD-SRV01_memory.dmp (16 GB)
Runs objective: audit filesystem for unauthorized executables + check for process hollowing
Detects and eliminates hallucinations through an independent Self-Correction auditor
Generates a complete Incident Response Runbook (Incident ID: IR-APM-20231027-001)

Both modes share:

Real-time SSE streaming of agent reasoning to the UI (no black box — full transparency)
Self-Correction engine: independent Gemini call that catches and removes unsupported claims
Multi-key API rotation: pool of 10 Gemini API keys with exponential backoff on 429 errors
Demonstrated in live session: 2 key rotations per run, 5 hallucinations caught per run
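
As a rough illustration of the SSE streaming shared by both modes, the sketch below encodes agent reasoning steps as Server-Sent Events frames. The function names, the event names ("step", "done"), and the payload fields are assumptions made for this example, not the project's actual code.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, payload: dict) -> str:
    """Encode one SSE frame: an event name, a single data line, then a blank line."""
    # json.dumps escapes newlines, so the payload always fits on one data: line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_steps(steps: Iterable[dict]) -> Iterator[str]:
    """Yield one frame per reasoning step, followed by a final 'done' frame."""
    total = 0  # stays 0 if no steps arrive
    for total, step in enumerate(steps, start=1):
        yield format_sse("step", {"iteration": total, **step})
    yield format_sse("done", {"total": total})


if __name__ == "__main__":
    for frame in stream_steps([{"thought": "check vssadmin"}]):
        print(frame, end="")
```

In a FastAPI backend, a generator like this can be wrapped in a StreamingResponse with media_type="text/event-stream" so the dashboard receives each step as it is produced.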

How we built it

Architecture: A dual-mode MCP client/server system with a self-correcting agentic loop.

Agent Core (core/agent.py): Gemini 2.5 Flash runs an autonomous loop (max 5 iterations), calling MCP tools, collecting tool outputs, and building confidence-scored findings on each pass.
MCP Layer:
- Splunk Mode — integrates with Splunk's official MCP Server for live alert triage
- SIFT Mode — connects to our custom sift_mcp_server (FastMCP), exposing read-only forensic tool wrappers (fls, volatility3, grep) as native Python functions
Self-Correction Engine (core/self_correct.py): An independent second Gemini call that acts as a forensic auditor. It compares every proposed finding against raw tool outputs and explicitly flags any claim not backed by hard evidence. Demonstrated live: caught 5 hallucinations per session, drove confidence to 0%.
Multi-Key API Resilience: Pool of 10 Gemini API keys rotates automatically on 429 RESOURCE_EXHAUSTED errors. Demonstrated live: key rotations at 5:44:55 AM, 5:45:24 AM (Splunk run) and 5:46:22 AM, 5:46:48 AM (SIFT run).
Frontend: Premium glassmorphic cyberpunk dashboard — vanilla HTML/CSS/JS with 400vh scroll-driven storytelling, requestAnimationFrame canvas animations, and lazy SSE connections.
Backend: FastAPI with SSE streaming. Deployed on Vercel + Hugging Face Spaces.
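
The capped agent loop described above can be sketched roughly as follows. This is a minimal outline under stated assumptions: the Gemini call and MCP tool calls are replaced by a caller-supplied step function, the auditor by a caller-supplied audit function, and the names run_agent, Finding, AgentResult, and the 0.8 stopping threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    claim: str
    confidence: float  # 0.0 to 1.0


@dataclass
class AgentResult:
    findings: List[Finding] = field(default_factory=list)
    iterations: int = 0


def run_agent(step: Callable[[int, List[Finding]], List[Finding]],
              audit: Callable[[List[Finding]], List[Finding]],
              max_iterations: int = 5,
              threshold: float = 0.8) -> AgentResult:
    """Repeat propose-then-audit until a finding clears the threshold or the cap is hit."""
    result = AgentResult()
    for iteration in range(1, max_iterations + 1):
        result.iterations = iteration
        # Stub for the model call plus tool calls; receives the surviving findings.
        proposed = step(iteration, result.findings)
        # Stub for the independent auditor; returns only supported findings.
        result.findings = audit(proposed)
        if any(f.confidence >= threshold for f in result.findings):
            break
    return result
```

The hard cap means a run always terminates, even when the auditor rejects every finding on every pass.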

Challenges we ran into

Gemini API quota exhaustion mid-investigation: Built a dynamic key rotation pool that catches 429 errors in real time and resumes the exact same request. Demonstrated live with 4 key rotation events across 2 sessions.
LLM hallucination in security context: Built a dedicated SelfCorrector class as a second independent Gemini call. In every live test, the auditor correctly flagged and discarded hallucinated findings, driving confidence to 0% until real evidence existed.
Preventing destructive tool calls: Used MCP's tool schema to restrict the agent at the architecture level. Only read-only wrappers are exposed, so the agent has no tool available that modifies forensic data.
Vercel 10-second serverless timeout: The agent loop runs 15–90 seconds, which exceeds the limit. Worked around it with graceful SSE error messaging and a local-run mode (localhost:8001).
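
The key rotation challenge above can be sketched as a small retry helper. This is an illustration only: RateLimitError is a hypothetical stand-in for whatever exception the Gemini client raises on 429 RESOURCE_EXHAUSTED, send is a caller-supplied function that issues the request with a given key, and the delay values are assumed defaults.

```python
import time
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's 429 RESOURCE_EXHAUSTED error."""


def call_with_rotation(keys: Sequence[str],
                       send: Callable[[str], str],
                       max_attempts: int = 10,
                       base_delay: float = 1.0,
                       max_delay: float = 30.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry the same request, moving to the next key after each rate-limit error."""
    if not keys:
        raise ValueError("at least one API key is required")
    for attempt in range(max_attempts):
        key = keys[attempt % len(keys)]  # ordered rotation through the pool
        try:
            return send(key)
        except RateLimitError:
            if attempt < max_attempts - 1:
                # Exponential backoff: base, 2x, 4x, ... capped at max_delay.
                sleep(min(base_delay * 2 ** attempt, max_delay))
    raise RuntimeError(f"all {max_attempts} attempts were rate limited")
```

Passing sleep as a parameter keeps the helper testable without real waiting, and because send is re-invoked with the same closure, the retried request is identical apart from the key.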

Accomplishments that we're proud of

✅ Self-correction actually works — 5 hallucinations caught and rejected per session
✅ Zero-hallucination architecture — Confidence: 0% is the correct output when evidence is absent. The system refuses to lie.
✅ 10-key API resilience — 4 live key rotation events demonstrated across two sessions
✅ Evidence integrity guaranteed — read-only MCP toolchain; agent cannot alter evidence
✅ Full audit trail — every tool call logged with timestamps to execution_log.json
✅ Complete IR Runbook generated — IR-APM-20231027-001 with 3-phase remediation plan
✅ Dual hackathon architecture — one codebase, one deployment, two winning tracks
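
The read-only toolchain and audit trail listed above could look roughly like the sketch below. It uses plain Python rather than FastMCP so it runs on its own. The registry, the decorator, the list_files placeholder, and the log fields are assumptions for illustration; only the execution_log.json filename comes from the project.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict, List

# Hypothetical registry: only read-only wrappers are ever registered here.
READ_ONLY_TOOLS: Dict[str, Callable[..., str]] = {}


def read_only_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function as a tool the agent is allowed to call."""
    READ_ONLY_TOOLS[func.__name__] = func
    return func


@read_only_tool
def list_files(image: str) -> str:
    """Placeholder for an fls wrapper; returns canned text in this sketch."""
    return f"(listing for {image})"


def call_tool(name: str, log: List[dict], **kwargs) -> str:
    """Run a registered tool and append a timestamped record of the call."""
    if name not in READ_ONLY_TOOLS:
        raise PermissionError(f"tool {name!r} is not exposed to the agent")
    output = READ_ONLY_TOOLS[name](**kwargs)
    log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": name,
        "args": kwargs,
        "output_chars": len(output),
    })
    return output


def save_log(log: List[dict], path: str = "execution_log.json") -> None:
    """Write the accumulated call records to disk as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(log, handle, indent=2)
```

Because the agent can only name tools that exist in the registry, a request for anything else fails before any command runs.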

What we learned

MCP is the future of agentic security tooling. A sandboxed tool schema enforces constraints that prompts alone cannot. The tool schema is your safety net.
Autonomous loops need explicit safety caps. max_iterations = 5 is not optional. Without it, unconstrained agents spiral into compounding hallucinations.
Self-correction requires full independence. A second, completely separate Gemini call with explicit auditor instructions catches real errors that self-review misses.
Streaming UX (SSE) builds trust. When judges and operators watch every reasoning step arrive live — including the corrections — trust increases dramatically.
Resilience is a first-class feature. API rate limits are not edge cases for autonomous agents. Multi-key rotation and exponential backoff must be in the core.
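
To make the independence point above concrete, here is a deliberately simplified stand-in for the auditor. In the project the audit is a second Gemini call; this sketch swaps that for a plain substring check so it can run offline. The Finding shape, the evidence field, and the sample strings are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Finding:
    claim: str
    evidence: str  # exact text the agent says it saw in a tool output


def audit(findings: Sequence[Finding],
          tool_outputs: Sequence[str]) -> Tuple[List[Finding], List[Finding]]:
    """Split findings into (supported, rejected) using only the raw tool outputs."""
    supported, rejected = [], []
    for finding in findings:
        quoted = finding.evidence.strip()
        # A finding survives only if its quoted evidence appears verbatim in some output.
        if quoted and any(quoted in output for output in tool_outputs):
            supported.append(finding)
        else:
            rejected.append(finding)
    return supported, rejected


if __name__ == "__main__":
    outputs = ["pid 4312 svchost.exe parent=explorer.exe"]  # made-up sample output
    proposed = [
        Finding("svchost has an unusual parent", "svchost.exe parent=explorer.exe"),
        Finding("mimikatz was executed", "mimikatz.exe"),
    ]
    kept, dropped = audit(proposed, outputs)
    print([f.claim for f in kept], [f.claim for f in dropped])
```

The auditor receives the findings and the raw outputs but none of the agent's intermediate reasoning, which is what keeps the check independent of the agent's own reasoning.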

Built With

alienvault-otx
css3
fastapi
fastmcp
gemini-2.5-flash
glassmorphism
google-gemini
html5
hugging-face-spaces
javascript
mcp
model-context-protocol
python
sans-sift
scrollytelling
server-sent-events
sleuthkit
splunk
sse
uvicorn
vercel
volatility3

Submitted to

Splunk Agentic Ops Hackathon

Created by

I built the whole project alone, using Google AI Studio to speed up coding, research, and report writing.
While building it I ran into several issues. Debugging and testing consumed a lot of AI quota, and the Gemini API key was exhausted quickly, so I integrated 10 API keys used in an ordered sequence to handle the problem.
I learned a lot and worked hard to overcome every challenge and finish the project. I'm thankful to the judges for running these hackathons, because they help young people grow, learn, and adapt to modern innovations faster.

Kushal Soni
A CS student with interest in cybersecurity & AI. Looking for an opportunity where I can contribute and keep growing.

Updates

Kushal Soni started this project — Jun 15, 2026 11:14 AM EDT
