Inspiration
Human incident responders need hours to triage a disk image. AI-powered attackers move in minutes. The FIND EVIL hackathon asks us to close that gap with autonomous agents on the SANS SIFT toolchain.
I built SIFT-Analyst so an analyst can mount real forensic evidence safely, detect evil automatically, and produce a traceable IR report without babysitting every command.
What it does
SIFT-Analyst autonomously analyzes the M57 Patents Terry USB forensic image (terry-work-usb-2009-12-11.E01).
It verifies the evidence's SHA-256 hash, mounts the E01 read-only, runs YARA malware rules and Sleuthkit timeline extraction, pre-filters filesystem anomalies with an Isolation Forest model, self-corrects on tool failures, and produces six artifact-backed findings with timestamped audit logs.
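The integrity check in the first step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `sha256_of_file` and `verify_evidence` are hypothetical, and the expected digest is the one listed in the dataset documentation for this image.

```python
import hashlib

# Digest listed in the dataset documentation for terry-work-usb-2009-12-11.E01.
EXPECTED_SHA256 = "1600fe2bdfb2bec0b006aa9d1c0ce6d3ad0b6666141a333bf830af7800fb9230"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large images never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_evidence(path, expected=EXPECTED_SHA256):
    """Return True only if the file's digest matches the expected value."""
    return sha256_of_file(path) == expected.lower()
```

Running the same check before and after analysis is one way to show the evidence file was not modified in between.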
Verified findings on M57 ground truth:
- F-001 CRITICAL: XP Advanced Keylogger V2.1 (R54402.EXE)
- F-002 CRITICAL: Stolen credentials m57admin / admin01
- F-003 HIGH: VNC 4.1.3 remote access
- F-004 HIGH: patentauto.py and webauto.py automation scripts
- F-005 HIGH: 2,471 surveillance screenshots (Dec 3-7 2009)
- F-006 MEDIUM: USB physically connected to Pat's workstation
How we built it
Architecture: Custom MCP Server + Claude Code (single-agent) + LangGraph (multi-agent)
Custom FastMCP server (mcp_server/sift_analyst_mcp.py) exposes 8 typed read-only forensic tools. No generic shell access. Destructive commands are never registered, so the agent has no way to call them.
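The allowlist idea behind the server can be illustrated without any MCP dependency. The sketch below is plain Python, not the FastMCP API and not the contents of sift_analyst_mcp.py; `ToolRegistry` and the `describe_image` tool are hypothetical names chosen for the example.

```python
from pathlib import PurePosixPath


class ToolRegistry:
    """Dispatches only functions that were explicitly registered."""

    def __init__(self):
        self._tools = {}

    def tool(self, fn):
        # Used as a decorator: registers the function under its own name.
        self._tools[fn.__name__] = fn
        return fn

    def names(self):
        return sorted(self._tools)

    def call(self, name, **kwargs):
        if name not in self._tools:
            # Unknown names are refused; there is no shell fallback.
            raise PermissionError(f"tool not registered: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.tool
def describe_image(path: str) -> dict:
    """Hypothetical read-only tool: reports path metadata without touching disk."""
    p = PurePosixPath(path)
    return {"name": p.name, "suffix": p.suffix}
```

With this shape, a request such as `registry.call("rm", path="/mnt/evidence")` raises `PermissionError` because no function named `rm` was ever registered. The guardrail is the absence of the tool, not an instruction in the prompt.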
Claude Code handles autonomous single-agent IR with MCP tool calls. Self-corrections are logged in analysis/forensic_audit.log.
LangGraph runs a three-agent pipeline: Triage (YARA + ML anomaly detection), Analysis (keylog artifacts and credentials), Report (IR markdown generation). All agent messages are timestamped in analysis/multi_agent_execution.log.
Architectural guardrails: OS-level read-only mount via ewfmount and mount -o ro. Spoliation test confirms writes are blocked. All outputs go to exports/ and reports/ only.
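A minimal sketch of how the two mount commands could be assembled, under stated assumptions: the partition offset of 63 from the dataset documentation is taken to be a sector number, sectors are assumed to be 512 bytes, and the mount directories are hypothetical. `ewfmount` exposes the raw image as a file named `ewf1` inside its mount directory. The helper only builds argument lists; it does not run anything.

```python
import shlex

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def build_mount_commands(image, ewf_dir, mount_point, start_sector=63):
    """Return the argv lists for a read-only mount of one partition in an E01."""
    offset_bytes = start_sector * SECTOR_SIZE  # 63 * 512 = 32256
    return [
        # Expose the E01 container as a raw device file (ewf_dir/ewf1).
        ["ewfmount", image, ewf_dir],
        # Loop-mount the partition read-only at its byte offset.
        ["mount", "-o", f"ro,loop,offset={offset_bytes}", f"{ewf_dir}/ewf1", mount_point],
    ]


if __name__ == "__main__":
    for argv in build_mount_commands(
        "terry-work-usb-2009-12-11.E01", "/mnt/ewf", "/mnt/evidence"
    ):
        print(shlex.join(argv))
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems if they are later passed to `subprocess.run`.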
ML: Isolation Forest pre-filter on filesystem timeline (exports/fs_timeline.csv) flags anomalous activity before deep LLM analysis.
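The pre-filter step can be sketched with scikit-learn's `IsolationForest`. The columns of exports/fs_timeline.csv are not described here, so the two features below (hour of day and files modified per minute) are purely illustrative, as are the synthetic data, the contamination rate, and the function name `flag_anomalies`.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalies(features, contamination=0.05, seed=0):
    """Return a boolean mask that is True for rows the model labels anomalous."""
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(features)  # -1 = anomaly, 1 = normal
    return labels == -1


# Synthetic stand-in for timeline features: [hour_of_day, files_modified_per_minute].
rng = np.random.default_rng(0)
normal = rng.normal(loc=[12.0, 5.0], scale=[2.0, 1.0], size=(200, 2))
burst = np.array([[3.0, 400.0]])  # a 3 a.m. burst of file writes
X = np.vstack([normal, burst])

mask = flag_anomalies(X)
suspicious_rows = X[mask]  # only these rows would go on to deeper analysis
```

The design point is cost: the model narrows thousands of timeline rows to a short list before any LLM call is made.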
Forensic toolchain: Sleuthkit, EWF Tools, YARA on Ubuntu 26.04 EC2 (SIFT-equivalent tools).
Challenges we ran into
Permission errors on ewfmount required autonomous self-correction (retry with sudo). This is logged as SELF-CORRECTION-1 in forensic_audit.log.
Write attempts to evidence paths fail on read-only mount. The agent redirects output to exports/ instead (SELF-CORRECTION-2).
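One way to express this behavior is a write probe plus a fallback directory. This is a sketch under assumptions, not the agent's actual logic: `write_blocked` and `choose_output_dir` are hypothetical names, and the probe treats any `OSError` (including the read-only filesystem error) as "blocked".

```python
import uuid
from pathlib import Path


def write_blocked(directory: Path) -> bool:
    """Return True if creating a new file inside `directory` fails."""
    probe = directory / f".probe-{uuid.uuid4().hex}"
    try:
        # Mode "x" creates the file and fails if it already exists.
        with open(probe, "x"):
            pass
    except OSError:
        return True
    probe.unlink()  # the write succeeded, so clean up the probe file
    return False


def choose_output_dir(preferred: Path, fallback: Path) -> Path:
    """Use `preferred` when writable; otherwise create and return `fallback`."""
    if write_blocked(preferred):
        fallback.mkdir(parents=True, exist_ok=True)
        return fallback
    return preferred
```

On a correctly mounted read-only evidence path, `write_blocked` returns True and output is routed to the fallback, for example `choose_output_dir(Path("/mnt/evidence"), Path("exports"))`.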
Keeping every finding traceable to a specific tool execution required structured agent logs with timestamps, not free-form LLM output.
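A structured log of this kind can be sketched as JSON Lines, one record per tool execution. The real format of forensic_audit.log is not shown in this write-up, so every field name below (`ts`, `tool`, `args`, `status`, `finding_id`, `note`) is an assumption made for illustration.

```python
import json
from datetime import datetime, timezone


def audit_record(tool, args, status, finding_id=None, note=None):
    """Build one log entry with a UTC timestamp."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "args": args,
        "status": status,
        "finding_id": finding_id,
        "note": note,
    }


def append_record(log_path, record):
    """Append the record as a single JSON line."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def records_for_finding(log_path, finding_id):
    """Return every logged tool call that supports the given finding."""
    with open(log_path, encoding="utf-8") as fh:
        return [r for r in map(json.loads, fh) if r.get("finding_id") == finding_id]
```

Because each line is machine-readable, a reviewer can ask for `records_for_finding(path, "F-001")` and get the exact tool calls behind that claim instead of searching free-form model output.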
Scope honesty: this submission analyzes a USB disk image only. Memory forensics and Windows registry analysis are documented as future work in reports/accuracy-report.md.
Accomplishments that we're proud of
Zero hallucinations across six findings. All six were verified against actual artifacts on the E01 image (see reports/accuracy-report.md).
15 YARA matches on live mounted evidence including XP Advanced Keylogger, Patent Theft Scripts, and VNC Remote Access rules.
239 ML timeline anomalies flagged by Isolation Forest pre-filter.
Spoliation protection test passes: write_blocked true on read-only evidence mount.
Three documented autonomous self-corrections in forensic_audit.log.
Full judge reproducibility: clone the repo, run python3 multi_agent/sift_multi_agent.py, and see the YARA matches and IR report without downloading the E01 (the bundled exports/files/ directory is included).
What we learned
Purpose-built MCP servers beat prompt-only guardrails for evidence protection. If the destructive tool does not exist in the MCP server, the agent cannot run it regardless of what the model tries.
Timestamped execution logs are as important as the findings. Judges and IR teams must trace every claim back to a specific tool call.
Honest accuracy reporting (missed artifacts, scope limits) builds more trust than claiming full coverage.
What's next for SIFT-Analyst
Disk plus memory correlation agent (cross-source findings from E01 and RAM capture).
Registry and prefetch analysis on full Windows disk images.
ML ablation benchmark comparing Isolation Forest pre-filter vs prompt-only Claude baseline.
Official SANS SIFT Workstation OVA compatibility test matrix.
DATASET DOCUMENTATION
- Dataset: M57 Patents Scenario, Terry USB Drive
- File: terry-work-usb-2009-12-11.E01 (31.95 MB)
- Source: https://digitalcorpora.org/corpora/scenarios/m57-patents-scenario/
- SHA-256: 1600fe2bdfb2bec0b006aa9d1c0ce6d3ad0b6666141a333bf830af7800fb9230
- Filesystem: FAT32, partition offset 63
- License: Public domain
What the agent found: Keylogger (R54402.EXE), stolen credentials in keylog HTML, VNC installer, USPTO automation scripts, 2,471 screenshots, USB connection evidence.
Built With
- amazon-web-services
- claude-code
- langgraph
- mcp
- python
- scikit-learn
- sleuthkit
- ubuntu
- yara