Inspiration
In digital forensics, findings are only as reliable as the evidence they are anchored to. Too many pipelines produce raw, unvalidated output that is difficult to trust, full of noise, and impossible to reproduce. We built this agent to change that, creating a self-correcting workflow that prioritizes verifiability, repeatability, and zero false positives.
What it does
Our agent automates the complete digital forensics workflow, from evidence acquisition through analysis to report generation, while ensuring every finding is verifiable and tamper-evident.
Core capabilities:
- Orchestrates 4 DFIR tools automatically: SleuthKit (filesystem), YARA (malware signatures), Plaso (timeline reconstruction), Volatility3 (memory forensics)
- Cross-corroborates findings: Every artifact is independently verified by multiple tools
- Self-corrects: Low-confidence findings automatically trigger re-analysis with expanded parameters (implemented in self_correct.py, which detects cross-tool contradictions and re-runs YARA/SleuthKit with broader rules when confidence < 0.7)
- Produces complete audit trails: Every action logged with tool → output → SHA256 hash → timestamp
- Enforces architectural constraints: Read-only mounts, write sandboxes, evidence hashing
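The self-correction rule in the list above can be sketched roughly as follows. This is an illustration only, not the contents of self_correct.py: the `Finding` type, `needs_reanalysis`, `self_correct`, and the `max_rounds` cap are hypothetical names introduced here, and the `rerun` callable stands in for a broader YARA/SleuthKit pass. Only the 0.7 threshold comes from the description.

```python
from dataclasses import dataclass, replace
from typing import Callable, List

CONFIDENCE_THRESHOLD = 0.7  # threshold quoted in the capability list


@dataclass(frozen=True)
class Finding:
    artifact: str      # normalized file path (hypothetical field)
    tool: str          # tool that produced the finding
    confidence: float  # 0.0 to 1.0


def needs_reanalysis(finding: Finding) -> bool:
    """A finding is re-analyzed when its confidence is below the threshold."""
    return finding.confidence < CONFIDENCE_THRESHOLD


def self_correct(
    findings: List[Finding],
    rerun: Callable[[Finding], Finding],
    max_rounds: int = 2,
) -> List[Finding]:
    """Re-run low-confidence findings, giving up after max_rounds attempts."""
    resolved = []
    for finding in findings:
        rounds = 0
        while needs_reanalysis(finding) and rounds < max_rounds:
            finding = rerun(finding)  # stand-in for a broader tool pass
            rounds += 1
        resolved.append(finding)
    return resolved


# Example with a fake rerun that simply reports a higher confidence.
initial = [
    Finding("/tmp/.hidden", "yara", 0.4),
    Finding("/etc/passwd", "sleuthkit", 0.9),
]
corrected = self_correct(initial, lambda f: replace(f, confidence=0.8))
print([f.confidence for f in corrected])
```

Capping the number of rounds keeps a finding that never improves from looping forever; such a finding would simply stay flagged as low confidence.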
Results: 100% detection rate, 0 false positives across 2 independent test cases.
How we built it
📐 Architecture Diagram — Full pipeline with security boundary classification (architectural vs prompt-based guardrails).
Architecture: OpenClaw Agent Extension (Pattern #1)
We extended OpenClaw to create a domain-specific DFIR orchestration agent. The pipeline runs in 6 phases:
- Evidence Discovery: Auto-detect disk images, memory dumps
- Tool Orchestration: Mount read-only → SleuthKit → YARA → Plaso → Volatility3
- Finding Generation: Each tool produces structured findings with evidence bindings
- Artifact-Level Deduplication: Merge findings by file path across tools
- Self-Correction: Check confidence thresholds; re-run with expanded parameters if needed
- Report Generation: JSON + Markdown with complete evidence chains
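As a rough sketch of the artifact-level deduplication phase, the snippet below merges per-tool findings that share a file path. The dictionary keys (`path`, `tool`, `severity`, `evidence`), the `merge_by_path` name, and the rule of keeping the highest severity are assumptions made for illustration; the severity labels are the ones used in the project's reports.

```python
from typing import Dict, Iterable, List

# Severity labels as they appear in the reports, lowest to highest.
SEVERITY_ORDER = ["info", "medium", "high", "critical"]


def merge_by_path(findings: Iterable[Dict]) -> List[Dict]:
    """Merge findings that point at the same file path across tools."""
    merged: Dict[str, Dict] = {}
    for finding in findings:
        entry = merged.setdefault(
            finding["path"],
            {"path": finding["path"], "tools": [], "severity": "info", "evidence": []},
        )
        if finding["tool"] not in entry["tools"]:
            entry["tools"].append(finding["tool"])
        entry["evidence"].append(finding["evidence"])
        # Keep the highest severity reported by any tool (an assumed policy).
        if SEVERITY_ORDER.index(finding["severity"]) > SEVERITY_ORDER.index(entry["severity"]):
            entry["severity"] = finding["severity"]
    return list(merged.values())


raw = [
    {"path": "/usr/bin/.svc", "tool": "sleuthkit", "severity": "high", "evidence": "body file entry"},
    {"path": "/usr/bin/.svc", "tool": "yara", "severity": "critical", "evidence": "rule match"},
    {"path": "/var/log/syslog", "tool": "plaso", "severity": "medium", "evidence": "timeline event"},
]
for artifact in merge_by_path(raw):
    print(artifact["path"], artifact["tools"], artifact["severity"])
```

The list of contributing tools on each merged artifact is also what a cross-corroboration check would count.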
Tech stack: Python 3.10, OpenClaw, SleuthKit 4.11, YARA 4.1, Plaso, Volatility3 2.28
Key design: Constraints are architectural, not prompt-based. Read-only mount enforced by OS. Evidence integrity enforced by SHA256.
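A minimal sketch of the SHA256 integrity check, assuming the evidence image is hashed once at acquisition and re-hashed after analysis. The function names `sha256_file` and `verify_unchanged` are hypothetical; the temporary file stands in for a real disk image.

```python
import hashlib
import os
import tempfile


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_unchanged(path: str, baseline_hex: str) -> bool:
    """Return True if the file still matches the hash taken at acquisition."""
    return sha256_file(path) == baseline_hex


# Demo on a throwaway file standing in for an evidence image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"synthetic evidence bytes")
    image_path = tmp.name

baseline = sha256_file(image_path)
print("intact:", verify_unchanged(image_path, baseline))

with open(image_path, "ab") as handle:
    handle.write(b"!")  # simulate tampering
print("intact after modification:", verify_unchanged(image_path, baseline))
os.remove(image_path)
```

The hash detects modification after the fact; preventing writes in the first place is the job of the read-only mount.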
Challenges we ran into
Finding public forensic test data: Most DFIR datasets are restricted. Created synthetic disk images with verifiable ground truth.
Cross-tool artifact normalization: SleuthKit outputs inode-based body files, YARA outputs rule+filepath pairs, Plaso outputs timestamped events. Required careful path normalization to merge.
Plaso version compatibility: Installed version (20201007) is 6+ years old. Works but lacks newer parsers.
Network restrictions: Couldn't download memory dump samples for Volatility3 testing. Integrated but documented as "ready but untested".
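The cross-tool path normalization mentioned above might look something like the sketch below. It assumes SleuthKit body lines in the pipe-delimited TSK 3.x layout (path in the second field), YARA command-line output of the form "rule path", and Plaso display names carrying a `TSK:` or `OS:` prefix. The mount point `/mnt/evidence` and all function names are hypothetical.

```python
import posixpath


def normalize_path(raw: str, mount_point: str = "/mnt/evidence") -> str:
    """Reduce a tool-specific path to an absolute path inside the image."""
    path = raw.strip()
    for prefix in ("TSK:", "OS:"):  # assumed Plaso-style type prefixes
        if path.startswith(prefix):
            path = path[len(prefix):]
            break
    mount = mount_point.rstrip("/")
    if path == mount or path.startswith(mount + "/"):
        path = path[len(mount):]  # drop the analysis-host mount point
    return posixpath.normpath("/" + path.lstrip("/"))


def path_from_body_line(line: str) -> str:
    """Body file fields are pipe-delimited; the second field is the name."""
    return normalize_path(line.split("|")[1])


def path_from_yara_line(line: str) -> str:
    """YARA CLI output is 'rule path'; split once so spaces in paths survive."""
    return normalize_path(line.split(None, 1)[1])


body = "0|/etc/cron.d/backdoor|1234|r/rrw-r--r--|0|0|120|1700000000|1700000000|1700000000|0"
yara = "Suspicious_Cron /mnt/evidence/etc/cron.d/backdoor"
plaso = "TSK:/etc/cron.d/backdoor"

paths = {path_from_body_line(body), path_from_yara_line(yara), normalize_path(plaso)}
print(paths)
```

Once all three tools agree on a single key per file, merging by path becomes a plain dictionary lookup.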
Accomplishments that we're proud of
0 false positives across 2 cases: Every finding traces to verifiable tool output.
3-tool cross-corroboration: Critical findings independently confirmed by SleuthKit + YARA + Plaso.
100% detection rate: All 14 ground truth artifacts across 2 cases detected.
Complete audit trail: JSONL log with tool executions, findings, decisions, timestamps, SHA256 hashes.
Cross-scenario generalization: Same agent on different attacks (backdoor vs ransomware), same quality.
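To illustrate the JSONL audit trail listed above, here is a minimal sketch that writes one JSON object per line with a tool name, a SHA256 of the tool output, and a UTC timestamp. The field names and the `audit_record` / `append_record` helpers are assumptions for illustration and may not match the project's actual log schema.

```python
import hashlib
import io
import json
from datetime import datetime, timezone
from typing import Optional, TextIO


def audit_record(tool: str, command: str, output: bytes, decision: Optional[str] = None) -> dict:
    """Build one audit entry: tool, command, hash of the output, timestamp."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "command": command,
        "output_sha256": hashlib.sha256(output).hexdigest(),
        "decision": decision,
    }


def append_record(stream: TextIO, record: dict) -> None:
    """Append the record as a single JSON line (JSONL)."""
    stream.write(json.dumps(record, sort_keys=True) + "\n")


# Demo using an in-memory stream instead of a log file.
log = io.StringIO()
append_record(log, audit_record("yara", "yara rules.yar /mnt/evidence", b"Rule /mnt/evidence/a"))
append_record(log, audit_record("agent", "self-correct", b"", decision="re-run with broader rules"))

for line in log.getvalue().splitlines():
    entry = json.loads(line)
    print(entry["tool"], entry["output_sha256"][:12], entry["timestamp"])
```

Because each line is independent JSON, the log can be appended to during a run and re-verified later by re-hashing the stored tool outputs.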
What we learned
Timeline analysis is the missing link: Disk-only forensics missed syslog events. Plaso brought detection from 83% to 100%.
Deduplication is essential: 4 tools produce overlapping findings. Artifact-level merging produces clean output.
Architecture beats prompts: "Don't modify evidence" (prompt) fails. Read-only mount (architecture) cannot fail.
Honesty > claims: Openly documenting limitations builds trust with judges.
What's next
- Real memory dump testing with Volatility3 (awaiting official samples)
- Plaso upgrade from 20201007 to 20240308
- Bayesian confidence calibration based on cross-tool agreement
- Network forensics via PCAP analysis (Suricata/Zeek)
- Case management web UI
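One possible shape for the planned Bayesian confidence calibration is a naive Bayes update over tool verdicts, sketched below. Everything here is hypothetical: the `posterior` function, the per-tool true-positive and false-positive rates, and the assumption that tools err independently. The rates are placeholders, not measurements from this project.

```python
from typing import Dict, Tuple


def posterior(
    prior: float,
    observations: Dict[str, bool],
    rates: Dict[str, Tuple[float, float]],
) -> float:
    """Update P(malicious) from tool verdicts, treating tools as independent.

    rates maps tool name to (true_positive_rate, false_positive_rate).
    """
    odds = prior / (1.0 - prior)
    for tool, flagged in observations.items():
        tpr, fpr = rates[tool]
        # Likelihood ratio of this verdict under "malicious" vs "benign".
        odds *= (tpr / fpr) if flagged else ((1.0 - tpr) / (1.0 - fpr))
    return odds / (1.0 + odds)


# Placeholder rates chosen only to make the arithmetic easy to follow.
ILLUSTRATIVE_RATES = {"yara": (0.9, 0.1), "sleuthkit": (0.9, 0.1), "plaso": (0.9, 0.1)}

one_tool = posterior(0.5, {"yara": True}, ILLUSTRATIVE_RATES)
two_tools = posterior(0.5, {"yara": True, "sleuthkit": True}, ILLUSTRATIVE_RATES)
disagree = posterior(0.5, {"yara": True, "sleuthkit": False}, ILLUSTRATIVE_RATES)
print(round(one_tool, 3), round(two_tools, 3), round(disagree, 3))
```

Under these assumptions, agreement between tools raises the posterior and disagreement pulls it back toward the prior, which is the behavior a calibrated cross-tool confidence score would need.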
Update: Third DFIR Case — Safe Malicious Office Document Analysis
To improve analysis breadth beyond backdoor and ransomware scenarios, we added TC-003: Malicious Office Document Analysis.
This case uses a safe, synthetic .docm evidence image with a disarmed macro indicator set. The sample is non-destructive: the embedded vbaProject.bin is static text, contains explicit DISARMED DEMO / DFIR-DEMO-NOEXEC markers, and does not execute, download, encrypt, persist, exfiltrate, or modify anything.
What the agent detects:
- Macro-enabled Office document: invoice_urgent.docm
- Mark-of-the-Web / Zone.Identifier download marker
- Embedded VBA project artifact
- AutoOpen macro auto-execution indicator
- CreateObject / WScript.Shell automation strings
- PowerShell EncodedCommand string, safely marked as demo-only
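A simple way to picture the indicator checks is a case-insensitive byte search over the embedded VBA artifact, sketched below. This works here only because the synthetic vbaProject.bin is described as static text; a real VBA project is stored in a compressed binary container and would need a proper parser. The indicator names and the `scan_indicators` function are hypothetical, and the project itself uses YARA rules rather than this code.

```python
from typing import Dict

# Indicator strings taken from the case description; keys are made-up labels.
INDICATORS = {
    "auto_exec": b"AutoOpen",
    "automation_createobject": b"CreateObject",
    "automation_wscript": b"WScript.Shell",
    "powershell_encoded": b"EncodedCommand",
    "disarmed_marker": b"DFIR-DEMO-NOEXEC",
}


def scan_indicators(data: bytes) -> Dict[str, int]:
    """Return the first byte offset of each indicator found (case-insensitive)."""
    lowered = data.lower()
    hits = {}
    for name, pattern in INDICATORS.items():
        offset = lowered.find(pattern.lower())
        if offset != -1:
            hits[name] = offset
    return hits


sample = b"Sub AutoOpen()\n' DFIR-DEMO-NOEXEC\nSet s = CreateObject(\"WScript.Shell\")\nEnd Sub"
for name, offset in scan_indicators(sample).items():
    print(f"{name} at byte {offset}")
```

Recording the byte offset alongside each hit gives the finding a concrete evidence binding that can be re-checked against the hashed artifact.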
Validated result:
- 37 evidence-anchored findings
- 5 critical, 22 high, 6 medium, 4 info
- SleuthKit + YARA + Plaso corroboration
- 7 self-correction triggers, 7/7 resolved
- 100% detection of the 6 synthetic ground-truth artifact classes
- 0 false positives against the case ground truth
The project now demonstrates three distinct DFIR investigation categories: Linux backdoor, ransomware, and malicious document triage.