Inspiration
Triaging a single suspicious binary can consume 60–90 minutes of an analyst's day --- hashing, YARA scanning, Volatility runs, IOC extraction, and report writing, all by hand. Meanwhile, adversaries churn out thousands of samples per hour. SIFT-AID was built to close that gap: a local-first, privacy-preserving triage agent that compresses the same workflow to under 8 minutes without sacrificing audit integrity. Inspired by the SANS SIFT Workstation, it shows that an LLM-guided pipeline can work through a senior analyst's triage steps without evidence ever leaving the workstation.
What it does
SIFT-AID is a fully autonomous malware triage agent running entirely on the SANS SIFT Workstation. Feed it a suspicious binary or memory image and it handles everything end-to-end:
- SHA-256 hashing and VirusTotal queries
- YARA scanning across 15+ families
- IOC extraction via strings and regex
- Volatility 3 memory forensics with 12 whitelisted plugins
- Docker sandbox behavioral analysis with read-only (:ro) privileges
- MITRE ATT&CK technique mapping
- Confidence-scored cross-validation
- Analyst-reviewed containment rules for iptables/nftables
Output is a dual-format report (JSON + Markdown) plus a valid STIX 2.1 bundle ready for SIEM/SOAR ingestion --- all in a single automated pipeline.
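As a rough illustration of the hashing and "strings and regex" IOC steps listed above, here is a minimal sketch. It is not SIFT-AID's code: the function names and the regex patterns are assumptions chosen for the example, and real triage patterns are usually stricter.

```python
import hashlib
import re

# Illustrative patterns only (assumed, not taken from SIFT-AID).
PRINTABLE_RE = re.compile(rb"[\x20-\x7e]{6,}")  # printable runs, similar to `strings -n 6`
IPV4_RE = re.compile(
    rb"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)
URL_RE = re.compile(rb"https?://[^\s<>]{4,}")


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the sample as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def extract_iocs(data: bytes) -> dict:
    """Pull IPv4 addresses and URLs out of the printable runs in a binary."""
    ips, urls = set(), set()
    for run in PRINTABLE_RE.findall(data):
        ips.update(m.decode("ascii") for m in IPV4_RE.findall(run))
        urls.update(m.decode("ascii") for m in URL_RE.findall(run))
    return {"sha256": sha256_hex(data), "ipv4": sorted(ips), "urls": sorted(urls)}
```

Calling `extract_iocs(open(path, "rb").read())` on a sample would return a small dictionary that later stages (VirusTotal lookup, STIX export) could consume; restricting the regexes to printable runs keeps matches from spanning binary noise.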
How we built it
The orchestration layer is LangGraph 0.2+ with 12 named nodes and cyclic self-correction. An MCP server acts as the security boundary --- exposing exactly 12 typed, read-safe functions with no generic shell execution.
Ten specialist agents handle the pipeline:
Hash · YARA · Volatility · IOC · BinaryAnalysis · EntropyAnalysis · VulnerabilityCheck · NetworkIntel · DynamicAnalysis · Containment
These are backed by a MITRE ATT&CK mapper and a LanceDB IOC store for cross-incident correlation. Evidence is mounted kernel-level read-only (:ro) inside Docker, the service runs as a non-root sentinel user, and a FastAPI dashboard streams real-time LangGraph node progress over WebSockets.
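One way to picture the "typed functions, no generic shell" boundary is a tool that accepts only a plugin name from a fixed set and builds an argument list itself. The sketch below is an assumption about how such a gate could look, not the project's implementation: `ALLOWED_PLUGINS`, `ToolRejected`, and `build_volatility_argv` are hypothetical names, and the three plugins shown stand in for the 12 the write-up mentions but does not list.

```python
from pathlib import Path

# Hypothetical subset; the actual 12 whitelisted plugins are not listed in the write-up.
ALLOWED_PLUGINS = frozenset({"windows.pslist", "windows.malfind", "windows.netscan"})


class ToolRejected(ValueError):
    """Raised when a request falls outside the whitelisted tool surface."""


def build_volatility_argv(image: Path, plugin: str) -> list[str]:
    """Check the plugin against the whitelist, then build a Volatility 3 command."""
    if plugin not in ALLOWED_PLUGINS:
        raise ToolRejected(f"plugin not whitelisted: {plugin!r}")
    # An argv list (never a shell string) means metacharacters are not interpreted.
    return ["vol", "-f", str(image), "-r", "json", plugin]
```

The returned list would be handed to `subprocess.run(argv, shell=False)`; because the check runs before any subprocess is created, a model-supplied string such as `"windows.pslist; rm -rf /"` is rejected rather than executed.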
Challenges we ran into
- Volatility output bloat --- malfind output can balloon past 10 MB. We handle it with 64 KB trimming and explicit [TRIMMED] markers so the model knows when data was cut.
- Resource contention --- parallel plugin execution against the same image caused conflicts, so we moved to sequential processing.
- YARA portability --- rule compatibility between python-yara and the CLI required a try/except degradation path.
- Dynamic analysis without CAPE --- we built an ephemeral per-sample Docker sandbox that spins up and tears down automatically.
- Hallucination prevention --- every finding in the JSON report must cite its exact MCP function, timestamp, and raw output snippet. The validate node rejects anything that lacks provenance.
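The output-trimming idea from the first bullet can be sketched in a few lines. The 64 KB limit and the [TRIMMED] marker come from the description above; the function name and the "N bytes omitted" suffix are assumptions added for illustration.

```python
TRIM_LIMIT = 64 * 1024      # 64 KB, as described above
TRIM_MARKER = "[TRIMMED]"   # explicit marker so the model can tell data was cut


def trim_output(raw: bytes, limit: int = TRIM_LIMIT) -> str:
    """Decode tool output, cutting it at `limit` bytes and flagging the cut."""
    if len(raw) <= limit:
        return raw.decode("utf-8", errors="replace")
    kept = raw[:limit].decode("utf-8", errors="replace")
    dropped = len(raw) - limit
    # The byte count in the suffix is an assumed convention, not SIFT-AID's exact format.
    return f"{kept}\n{TRIM_MARKER} {dropped} bytes omitted"
```

Decoding with `errors="replace"` keeps the function from raising if the cut lands in the middle of a multi-byte character.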
Accomplishments that we're proud of
SIFT-AID consistently triages complex NIST CFReDS and DFRWS forensic images in 144–184 seconds --- well inside the 8-minute SLA --- with 100% precision and zero false positives across four malicious datasets and 15 clean-software baselines.
The architectural guardrails are genuinely novel:
- An MCP server with no generic execution endpoints
- Kernel-level read-only evidence mounting
- A Volatility plugin whitelist enforced before any subprocess call
- Programmatic traceability of every finding in every report to its source
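A provenance check of the kind described (each finding must carry its MCP function, timestamp, and raw output snippet) might look like the sketch below. The `Finding` fields and function names are hypothetical; the actual report schema is not shown in this write-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Field names are assumptions made for this example.
    claim: str
    mcp_function: str
    timestamp: str
    raw_snippet: str


def missing_provenance(finding: Finding) -> list[str]:
    """Return the names of provenance fields that are empty or whitespace."""
    required = ("mcp_function", "timestamp", "raw_snippet")
    return [name for name in required if not getattr(finding, name).strip()]


def validate(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Split findings into (accepted, rejected) based on provenance completeness."""
    accepted, rejected = [], []
    for finding in findings:
        (rejected if missing_provenance(finding) else accepted).append(finding)
    return accepted, rejected
```

A stricter variant could also confirm that `raw_snippet` actually appears in the stored tool output, which would catch a fabricated citation rather than only a missing one.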
We're especially proud that the self-correction loop correctly escalated the ambiguous DFRWS 2005 steganography challenge to "Analyst Review" rather than forcing a verdict.
What we learned
- Design the MCP tool interface first. Defining what the LLM is allowed to do before writing any agent code forces real architectural thinking --- far more effective than prompt-level guardrails, which can be bypassed.
- LangGraph's cyclic graph with conditional routing makes self-correction explicit and auditable: every state transition is a named edge.
- Voting beats probability calibration for confidence scoring without a calibration dataset --- each corroborating tool adds a vote, making results more transparent and explainable.
- A local LLM can orchestrate complex forensic toolchains effectively, proving that AI-assisted DFIR requires neither cloud APIs nor data exfiltration.
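The vote-based confidence idea above can be illustrated with a short sketch. The threshold (`min_votes`), the label names, and the function name are assumptions for the example; the write-up does not state SIFT-AID's actual scoring rules.

```python
def vote_confidence(tool_verdicts: dict[str, bool], min_votes: int = 2) -> dict:
    """Count one vote per tool that flagged the sample and derive a label."""
    votes = sorted(name for name, flagged in tool_verdicts.items() if flagged)
    total = len(tool_verdicts)
    score = len(votes) / total if total else 0.0
    if len(votes) >= min_votes:
        label = "malicious"
    elif votes:
        label = "analyst_review"  # a single uncorroborated signal is escalated
    else:
        label = "clean"
    # Returning the voter names keeps the score explainable in the report.
    return {"label": label, "score": score, "votes": votes}
```

Because the result lists which tools voted, an analyst can see why a sample received its score, which is the transparency benefit the bullet describes.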
What's next for SIFT-AID
- Multi-sample batch mode with queue management for high-volume triage
- Expanded MITRE ATT&CK coverage to 100+ techniques
- MISP integration for bidirectional threat intelligence sharing
- Timeline analysis --- super-timeline generation from MFT, journal, and logs, plus automated spoliation detection
- Cross-organization threat intel feed evolving from the LanceDB IOC memory
- Extended forensic image support beyond Windows and Linux to macOS and Android
Built With
- docker
- fastapi
- lancedb
- langgraph
- mcp
- ollama
- python
- python-yara
- stix
- virustotal
- volatility
