Inspiration
Security teams are drowning in log noise, and by the time a human spots a brute-force attack, the damage is often already done. We wanted to see what happens when you give an LLM not just visibility into that noise, but the ability to act: read the logs, make a judgment call, write the fix, and loop in a human only when needed. breach-bot is our answer: an AI security analyst that doesn't just alert, it remediates.
What It Does
breach-bot runs a closed-loop security pipeline. It pulls live authentication logs from Splunk via the REST API and sends them to Claude, which acts as an automated analyst and returns a structured verdict with a confidence score, a severity, and a recommended fix. If Claude's confidence is at least 85%, breach-bot automatically opens a GitHub PR containing the firewall rule fix and posts a formatted incident report to Slack with a link to that PR. If confidence is lower, it still alerts the team in Slack but flags the incident for human review instead of auto-remediating. If there's no attack at all, it stays silent.
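The three branches can be sketched as a small routing function. This is an illustration rather than breach-bot's actual code: the `Verdict` field names and the branch labels are assumptions, and only the 85% threshold comes from the description above.

```python
from dataclasses import dataclass

AUTO_REMEDIATE_THRESHOLD = 0.85  # the 85% cutoff described above


@dataclass
class Verdict:
    # Hypothetical field names; the real verdict schema is not shown in this writeup.
    attack_detected: bool
    confidence: float  # 0.0 to 1.0
    severity: str
    recommended_fix: str


def decide(verdict: Verdict) -> str:
    """Map a verdict to one of the three branches: ignore, flag, or act."""
    if not verdict.attack_detected:
        return "ignore"  # no attack: stay silent
    if verdict.confidence >= AUTO_REMEDIATE_THRESHOLD:
        return "auto_remediate"  # open the PR and post the report to Slack
    return "flag_for_review"  # alert Slack, but leave the fix to a human
```

Keeping the decision in one pure function makes each branch easy to check without touching Slack or GitHub.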
How We Built It
We built incrementally, proving each integration standalone before wiring them together: first the Claude API core (with robust JSON-output parsing), then Slack alerting, then GitHub PR automation via PyGithub. Once those three worked end-to-end against fake log data, we tackled the hardest piece — Splunk. We initially attempted integration via the official Splunk MCP server, then pivoted to a direct REST API client (more on that below). Finally, we built out three test scenarios — a clear attack, an ambiguous low-confidence case, and normal traffic — to demonstrate all three branches of breach-bot's decision logic, selectable via a --index CLI flag.
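A minimal sketch of how a `--index` flag could select a scenario, assuming each scenario's logs live in their own Splunk index. The writeup only says the flag selects the scenario, so the program name and the default shown here are illustrative.

```python
import argparse


def parse_args(argv=None):
    """Parse the CLI; --index picks which Splunk index (scenario) to analyze."""
    parser = argparse.ArgumentParser(prog="breach-bot")  # prog name is illustrative
    parser.add_argument(
        "--index",
        default="main",  # Splunk's default index; the project's default may differ
        help="Splunk index holding the scenario's authentication logs",
    )
    return parser.parse_args(argv)
```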
Challenges We Ran Into
The biggest challenge was the Splunk MCP server itself. We hit a Python version requirement (3.12+), then a FastMCP API mismatch with the installed mcp SDK, then discovered the server's own pyproject.toml required mcp>=1.3.0 for one fix but that version broke a different part of its initialization — an unresolvable internal version conflict in a third-party tool. After investing real time, we pivoted to calling Splunk's REST API directly with bearer token auth — same end result, more reliable. Smaller but real debugging moments included: a stray = character silently corrupting a token in .env, a GitHub 404 error that was actually a disguised permissions issue, and a subtle bug where our fake log timestamps "aged out" of Splunk's default -24h search window overnight.
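For reference, a direct REST call with bearer token auth might look roughly like the sketch below. It uses Splunk's `search/jobs/export` endpoint and only the standard library. The function name, the host, and the index are placeholders, and this is not necessarily how breach-bot's client is written.

```python
import urllib.parse
import urllib.request


def build_export_request(base_url, token, index, earliest="-24h"):
    """Build (but do not send) a search request against Splunk's export endpoint."""
    params = {
        "search": f"search index={index}",  # the query must begin with "search"
        "earliest_time": earliest,  # events older than this are not returned
        "output_mode": "json",
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/services/search/jobs/export",
        data=urllib.parse.urlencode(params).encode("utf-8"),  # form-encoded body
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen`. Making `earliest` an explicit parameter also addresses the aging-out bug: passing a wider window such as `-7d` keeps day-old test fixtures in range.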
Accomplishments That We're Proud Of
We're proud that breach-bot is a genuinely working, end-to-end pipeline across four real services — not a mockup. We're especially proud of the resilient fallback design: if Splunk is unreachable, breach-bot gracefully degrades to local sample data rather than crashing the demo. And we're proud of the three-tier confidence logic — it shows AI making a nuanced decision (act / flag / ignore) rather than a blunt binary alert.
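The fallback idea can be sketched as follows. The function names and the choice to catch `OSError` are assumptions made for illustration; the exceptions the real pipeline handles are not described here.

```python
import json
from pathlib import Path


def read_sample_file(path):
    """Load bundled sample events from a JSON file (the path is up to the caller)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def load_events(fetch_live, load_sample):
    """Return (events, source): live data when available, sample data otherwise."""
    try:
        return fetch_live(), "splunk"
    except OSError as exc:  # covers urllib connection errors and socket timeouts
        print(f"Splunk unreachable ({exc}); falling back to sample data")
        return load_sample(), "sample"
```

Returning the source alongside the events lets later stages, such as the Slack report, note when a verdict was based on sample data rather than live logs.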
What We Learned
The MCP ecosystem is exciting but still young — third-party server implementations can have brittle, conflicting dependencies, and sometimes a direct API integration is the more reliable choice on a hackathon timeline. We also learned hands-on techniques for getting reliable structured output from Claude (JSON-only prompting with fallback parsing), and the importance of designing security automation with confidence thresholds so AI acts decisively only when it should — and defers to humans when it shouldn't.
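One common form of fallback parsing is shown below as a sketch: try the reply as strict JSON first, then fall back to the outermost `{...}` span if the model wrapped it in a code fence or extra prose. The function name is hypothetical, and breach-bot's own parser may work differently.

```python
import json


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating code fences or surrounding prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fallback: take the outermost {...} span and try again.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(text[start : end + 1])
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, callers can catch `ValueError` to handle every failure mode in one place.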
What's Next for breach-bot
- Cloudflare integration — extend remediation to live account-level IP firewall rules, not just GitHub PRs
- True autonomous loop — a scheduler that runs breach-bot every 30 minutes without manual triggering
- Broader threat detection — port scans, data exfiltration patterns, anomalous access times, not just brute force
- Multi-source correlation — combine Splunk auth logs with other data sources for richer context
- Incident history dashboard — visualize breach-bot's past detections and actions over time
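As one possible shape for the scheduler item above, a plain loop with a sleep between runs would be enough to start. Everything in this sketch is hypothetical, including the function name and the `max_runs` and `sleep` parameters, which exist mainly to make the loop testable.

```python
import time


def run_on_schedule(job, interval_seconds=30 * 60, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing interval_seconds between runs."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:  # one failed cycle should not stop the loop
            print(f"run failed: {exc}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)  # skipped after the final run
    return runs
```

In a deployed setting, a cron job or a hosted scheduler would likely be a sturdier choice than a long-lived process.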