EvidenceLoop SIFT

Autonomous DFIR triage with evidence-first reasoning and auditable self-correction.

EvidenceLoop SIFT is an autonomous digital forensics and incident response (DFIR) triage agent for the SANS SIFT Workstation. It plans an investigation, runs safe forensic tools, and updates its reasoning as new evidence comes in. It builds a hypothesis ledger, detects contradictions, and produces an analyst-ready report that clearly separates confirmed facts from inferred claims and open questions. Every conclusion is tied back to exact tool output and execution IDs, so the final result is auditable instead of being a black box.
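To make the ledger idea concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the project's actual code: the names ClaimStatus, Claim, add_claim, and report_sections are hypothetical, and so is the rule that a confirmed claim must cite at least one execution ID.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimStatus(Enum):
    CONFIRMED = "confirmed"  # directly supported by tool output
    INFERRED = "inferred"    # reasoned from evidence, not directly observed
    OPEN = "open"            # still an open question


@dataclass
class Claim:
    text: str
    status: ClaimStatus
    # IDs of the tool executions that back this claim (hypothetical format).
    execution_ids: list[str] = field(default_factory=list)


def add_claim(ledger, text, status, execution_ids=()):
    """Append a claim, refusing confirmed claims with no traceable evidence."""
    if status is ClaimStatus.CONFIRMED and not execution_ids:
        raise ValueError("confirmed claims must cite at least one execution ID")
    claim = Claim(text, status, list(execution_ids))
    ledger.append(claim)
    return claim


def report_sections(ledger):
    """Group claim texts by status so the report keeps the three kinds apart."""
    sections = {status.value: [] for status in ClaimStatus}
    for claim in ledger:
        sections[claim.status.value].append(claim.text)
    return sections
```

The point of the sketch is that auditability can be enforced at write time: a claim cannot enter the "confirmed" section unless it names the execution that produced its evidence.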

Inspiration

We built EvidenceLoop SIFT because incident responders are still expected to move fast with too many tabs, too many tools, and not enough time. In a world where attackers can pivot in minutes, we wanted to explore what it looks like to give defenders an AI partner that reasons like a senior analyst instead of just firing off commands. The core idea was simple: if an agent is going to help in real incidents, it has to be evidence-first, transparent, and willing to admit when it was wrong.

How we built it

We split the system into a Next.js frontend and a Python FastAPI backend, with LangGraph driving the investigation state machine. A custom MCP server exposes only approved DFIR actions through typed, structured JSON responses, which keeps the agent inside safe boundaries while still letting it operate autonomously. We used SQLite and Prisma for persistence, SSE for live progress updates, and a clean dashboard to show the investigation plan, evidence timeline, contradictions, and guardrails in real time.
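The allowlist pattern behind the tool server can be sketched as follows. This is a simplified stand-in in plain Python, not the MCP SDK and not the project's real server: the tool names (hash_file, list_timeline), the stub handlers, and the ToolResult fields are all assumptions made for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass


@dataclass
class ToolResult:
    execution_id: str  # unique ID so later conclusions can cite this run
    tool: str
    ok: bool
    output: str


def hash_file(target: str) -> str:
    # Stub: a real handler would invoke a read-only forensic tool.
    return f"(stub) hash of {target}"


def list_timeline(target: str) -> str:
    # Stub: a real handler would parse artifacts into a timeline.
    return f"(stub) timeline for {target}"


# Only actions listed here can ever be executed by the agent.
APPROVED_TOOLS = {
    "hash_file": hash_file,
    "list_timeline": list_timeline,
}


def run_tool(name: str, target: str) -> str:
    """Dispatch an approved tool and return a structured JSON response."""
    execution_id = str(uuid.uuid4())
    handler = APPROVED_TOOLS.get(name)
    if handler is None:
        result = ToolResult(execution_id, name, False, "tool is not on the allowlist")
    else:
        result = ToolResult(execution_id, name, True, handler(target))
    return json.dumps(asdict(result))
```

Calling run_tool("hash_file", "disk.img") returns a JSON object with ok set to true, while any name outside APPROVED_TOOLS gets a structured refusal instead of being executed.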

Challenges

The hardest part was getting autonomy without losing trust. We had to design the agent so it could explore, but only through constrained tools and explicit evidence checks, and then make sure it would actually revise its beliefs when the data disagreed. Another challenge was making the output readable for humans while still preserving enough structure and metadata to be defensible in an incident response workflow.
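One simple way to force belief revision is to derive a hypothesis's status from its recorded evidence rather than storing it separately. The sketch below assumes a deliberately strict policy (any contradicting evidence marks the hypothesis as contradicted); the Hypothesis class, record_evidence function, and status labels are hypothetical and not taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    supporting: list[str] = field(default_factory=list)     # execution IDs
    contradicting: list[str] = field(default_factory=list)  # execution IDs

    @property
    def status(self) -> str:
        # Status is computed from evidence, so it cannot drift out of sync.
        if self.contradicting:
            return "contradicted"
        if self.supporting:
            return "supported"
        return "unverified"


def record_evidence(hypothesis: Hypothesis, execution_id: str, agrees: bool) -> str:
    """Attach one tool execution to the hypothesis and return the new status."""
    if agrees:
        hypothesis.supporting.append(execution_id)
    else:
        hypothesis.contradicting.append(execution_id)
    return hypothesis.status
```

Because the status is a computed property, the agent cannot keep asserting a hypothesis once a disagreeing result has been recorded against it.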

Accomplishments

We’re proud that the agent doesn’t just summarize evidence; it reasons over it and corrects itself when needed. The live demo case is especially satisfying because it starts with a plausible false lead, then shows the system pivoting as contradictions appear, which makes the whole loop feel real instead of scripted. We also built a report format that feels useful to a responder on day one, not just impressive in a demo.

What we learned

We learned that in security, speed is important, but trust is everything. The best AI workflow is not the one that sounds most confident; it’s the one that can show its work, track uncertainty, and recover gracefully when it gets something wrong. We also learned how much value there is in constraining an agent with typed tools and explicit states instead of letting it improvise freely.
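As an illustration of what "explicit states" can mean, the sketch below encodes the allowed transitions of an investigation loop as data. It is plain Python rather than LangGraph, and the phase names and transition table are assumptions chosen for the example, not the project's actual graph.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "plan"
    COLLECT = "collect"
    EVALUATE = "evaluate"
    REPORT = "report"


# Every legal move is listed here; anything else is rejected.
ALLOWED_TRANSITIONS = {
    Phase.PLAN: {Phase.COLLECT},
    Phase.COLLECT: {Phase.EVALUATE},
    # After evaluation, either replan (e.g. on a contradiction) or report.
    Phase.EVALUATE: {Phase.PLAN, Phase.REPORT},
    Phase.REPORT: set(),
}


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the target phase only if the transition table permits it."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

With the transitions written down as a table, the agent cannot skip from planning straight to reporting, and every path through the loop is easy to review.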

What’s next

Next, we want to expand the safe tool coverage, add more artifact parsers, and support richer case types beyond the curated demo dataset. We’d also like to improve the scoring and prioritization logic so the agent can better decide what to investigate first under time pressure. Longer term, we want this to become a practical open-source assistant that real responders can trust during active incidents.

Built With

  • docker
  • fastapi
  • langgraph
  • lucide-react
  • mcp-(model-context-protocol)
  • next.js-14
  • prisma
  • pydantic
  • python-3.11
  • recharts
  • server-sent-events
  • shadcn/ui
  • sqlite
  • tailwind-css
  • tanstack-query
  • typescript
  • zod
  • zustand