About the Project

Inspiration

Digital forensics investigations follow a strict methodology: evidence must be reproducible, the chain of custody must be unbroken, and conclusions must be defensible in court. Most current AI tools, however, just dump raw tool output. There is no visible reasoning, no methodology tracking, and no way to tell whether the AI actually covered all artifact categories or simply hallucinated its findings.

We asked: what if we put a reasoning kernel between the LLM and the forensic toolchain? One that enforces methodology, fuses evidence mathematically, and produces reports with cryptographic integrity guarantees?

What it does

sift-kernel is a custom MCP (Model Context Protocol) server that gives any AI agent — Claude, GPT, Gemini — a forensic brain. Point it at a disk image and say "investigate." The kernel handles:

  • 129 forensic operations across 14 categories (registry, event logs, filesystem, memory, network, persistence, execution artifacts, browser, user activity, timeline, anti-forensics, correlation, Linux, and acquisition)
  • FARE reasoning engine — Forensic Active Reasoning Engine using Dempster-Shafer theory with PCR5 conflict resolution for evidence fusion (not naive Bayesian — handles contradictory evidence correctly)
  • Active Inference (Free Energy minimization) to prioritize which tools to run next based on expected information gain
  • Methodology FSM — enforces COLLECTION → TRIAGE → CLASSIFY → INVESTIGATE → TIMELINE → CORRELATE → REPORT progression
  • Hash-chained evidence ledger — every tool invocation, its raw output, and derived findings are stored in a tamper-evident SQLite chain with HMAC sealing
  • Interactive HTML reports with SVG entropy curves, MITRE ATT&CK mapping, kill-chain timeline, and confidence scoring per finding
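
To make the hash-chained ledger idea concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not the project's actual schema: the `LedgerEntry` fields, the helper names, and the choice of SHA-256 are hypothetical, and SQLite persistence is left out. Only `createHash` and `createHmac` from Node's built-in crypto module are used.

```typescript
import { createHash, createHmac } from "node:crypto";

// One ledger row. Field names are illustrative, not the project's schema.
interface LedgerEntry {
  seq: number;
  tool: string;
  outputSha256: string; // digest of the raw tool output
  prevHash: string;     // hash of the previous entry (all zeros for the first)
  hash: string;         // digest over this entry's fields plus prevHash
}

const GENESIS = "0".repeat(64);

const sha256 = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Append an entry whose hash commits to its content and to its predecessor.
function appendEntry(chain: LedgerEntry[], tool: string, rawOutput: string): LedgerEntry[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  const seq = chain.length;
  const outputSha256 = sha256(rawOutput);
  const hash = sha256(JSON.stringify([seq, tool, outputSha256, prevHash]));
  return [...chain, { seq, tool, outputSha256, prevHash, hash }];
}

// Recompute every link; return the index of the first broken entry, or -1.
function firstBrokenLink(chain: LedgerEntry[]): number {
  let prevHash = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    const expected = sha256(JSON.stringify([e.seq, e.tool, e.outputSha256, prevHash]));
    if (e.seq !== i || e.prevHash !== prevHash || e.hash !== expected) return i;
    prevHash = e.hash;
  }
  return -1;
}

// Seal the chain head with a keyed MAC, so rewriting the whole chain
// consistently would also require the key.
function sealChain(chain: LedgerEntry[], key: string): string {
  const head = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  return createHmac("sha256", key).update(head).digest("hex");
}
```

Because each entry's hash covers the previous entry's hash, editing any stored digest invalidates that entry and every later one. The HMAC over the head covers the remaining case where an attacker recomputes the entire chain.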

The math behind evidence fusion:

$$m_{1,2}(A) = \frac{\sum_{B \cap C = A} m_1(B) \cdot m_2(C)}{1 - \sum_{B \cap C = \emptyset} m_1(B) \cdot m_2(C)}$$

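Here $k = \sum_{B \cap C = \emptyset} m_1(B) \cdot m_2(C)$ is the conflict mass that appears in the denominator above. When conflict is high ($k > 0.3$), we switch to PCR5 (Proportional Conflict Redistribution), which starts from the unnormalized conjunctive combination $m_{12}(A) = \sum_{B \cap C = A} m_1(B) \cdot m_2(C)$ and returns each partial conflict to the two sets that produced it:

$$m_{PCR5}(A) = m_{12}(A) + \sum_{B \cap A = \emptyset} \left[ \frac{m_1(A)^2 \cdot m_2(B)}{m_1(A) + m_2(B)} + \frac{m_2(A)^2 \cdot m_1(B)}{m_2(A) + m_1(B)} \right]$$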
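
The two rules can be sketched in a few lines of TypeScript. This is a minimal illustration for two sources, not the FARE implementation. Encoding focal sets as bitmasks, the `Mass` type, the function names, and the example numbers are all assumptions made for the sketch.

```typescript
// A mass function maps focal sets (bitmasks over the frame) to mass; masses sum to 1.
type Mass = Map<number, number>;

// Unnormalized conjunctive combination m12, plus the total conflict k.
function conjunctive(m1: Mass, m2: Mass): { m12: Mass; k: number } {
  const m12: Mass = new Map();
  let k = 0;
  for (const [b, mb] of m1) {
    for (const [c, mc] of m2) {
      const a = b & c; // set intersection on bitmasks
      if (a === 0) k += mb * mc;
      else m12.set(a, (m12.get(a) ?? 0) + mb * mc);
    }
  }
  return { m12, k };
}

// Dempster's rule: drop the conflict and renormalize by 1 - k.
function dempster(m1: Mass, m2: Mass): Mass {
  const { m12, k } = conjunctive(m1, m2);
  if (k >= 1) throw new Error("total conflict: Dempster's rule is undefined");
  const out: Mass = new Map();
  for (const [a, v] of m12) out.set(a, v / (1 - k));
  return out;
}

// PCR5: give each partial conflict m1(A)*m2(B) back to A and B,
// in proportion to m1(A) and m2(B).
function pcr5(m1: Mass, m2: Mass): Mass {
  const out: Mass = new Map(conjunctive(m1, m2).m12);
  for (const [a, m1a] of m1) {
    for (const [b, m2b] of m2) {
      if ((a & b) !== 0) continue; // only disjoint (conflicting) pairs
      const denom = m1a + m2b;
      if (denom === 0) continue;
      out.set(a, (out.get(a) ?? 0) + (m1a * m1a * m2b) / denom);
      out.set(b, (out.get(b) ?? 0) + (m2b * m2b * m1a) / denom);
    }
  }
  return out;
}

// Switch rules at the conflict threshold mentioned in the text.
function fuse(m1: Mass, m2: Mass, threshold = 0.3): Mass {
  return conjunctive(m1, m2).k > threshold ? pcr5(m1, m2) : dempster(m1, m2);
}

// Frame {malicious, benign}; THETA is the whole frame ("I don't know").
const MALICIOUS = 0b01, BENIGN = 0b10, THETA = 0b11;
const timestampEvidence: Mass = new Map([[MALICIOUS, 0.8], [THETA, 0.2]]);
const signatureEvidence: Mass = new Map([[BENIGN, 0.7], [THETA, 0.3]]);
const fused = fuse(timestampEvidence, signatureEvidence);
```

In this made-up example the conflict is k = 0.56, so `fuse` takes the PCR5 branch. The result is roughly 0.539 on malicious, 0.401 on benign, and 0.06 left uncommitted. Dempster's rule would instead discard the conflicting 0.56 and rescale the remaining 0.44 to 1.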

How we built it

TypeScript MCP server with hexagonal architecture:

  • Domain layer: methodology FSM, capability DAG, finding/hypothesis lifecycle, hash-chained ledger
  • Reasoning layer: Dempster-Shafer fusion, Active Inference (EFE), rough-sets confidence tiers, bias detection, convergence monitoring, forensic ontology (27 nodes)
  • Adapter layer: process executor (shell:false + command allowlist for security), SQLite ledger, filesystem evidence store
  • Intelligence layer: beaconing detection, timestomping analysis, log gap identification, burst detection, wiping tool signatures

The server wraps SIFT Workstation tools (Sleuth Kit, RegRipper, Volatility3, Plaso, YARA, tshark, Zimmerman .NET tools) and exposes them as MCP tools that any AI agent can call. The kernel tracks what's been done, what's missing, and what should come next.
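
As a rough sketch of the "shell:false + command allowlist" idea behind the process executor, the snippet below refuses any binary that is not explicitly listed and then runs it without a shell. The allowlist contents, the `validateCommand` and `runTool` names, and the timeout and buffer limits are assumptions for illustration. Only Node's built-in `execFile` and `promisify` are used.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical allowlist; the real kernel's list is not shown in this write-up.
const ALLOWED_BINARIES: ReadonlySet<string> = new Set([
  "fls", "icat", "istat", "tshark", "yara",
]);

// Reject anything outside the allowlist before a process is ever spawned.
function validateCommand(binary: string, args: readonly string[]): void {
  if (!ALLOWED_BINARIES.has(binary)) {
    throw new Error(`binary not allowlisted: ${binary}`);
  }
  for (const arg of args) {
    if (arg.includes("\0")) throw new Error("argument contains a NUL byte");
  }
}

// Run an allowlisted tool. With shell:false the arguments reach the binary
// as an argv array, so shell metacharacters in them are never interpreted.
async function runTool(binary: string, args: string[], timeoutMs = 60_000): Promise<string> {
  validateCommand(binary, args);
  const { stdout } = await execFileAsync(binary, args, {
    shell: false,
    encoding: "utf8",
    timeout: timeoutMs,
    maxBuffer: 64 * 1024 * 1024,
  });
  return stdout;
}
```

An MCP tool handler could then call something like `runTool("fls", ["-r", imagePath])` and pass the returned text to a parser. The agent only ever supplies arguments, never a command line.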

Challenges we faced

  1. ESM/CJS boundary: importing the Node.js crypto module behaves differently under vitest's loader than under raw ESM, so tests passed while the runtime crashed until we tracked down a stray require() call in a pure-ESM module
  2. Evidence image tooling — SIFT tools expect mounted filesystems; we had to build a raw-offset extraction pipeline using icat/fls/istat directly on E01 images
  3. Conflict in evidence — real investigations produce contradictory signals (timestomped files that look legitimate, cleared logs that might be routine maintenance). Naive fusion breaks; PCR5 handles it correctly
  4. Methodology enforcement without rigidity — the FSM must prevent skipping phases but not block legitimate back-tracking when new evidence changes the picture
  5. Making the kernel itself defensible: we wrote 114 unit, integration, and property tests covering the reasoning math, ledger integrity, parser correctness, and FSM transitions
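
The tension in challenge 4 can be illustrated with a small transition check over the phase list from "What it does". The policy shown here is an assumption for the sketch, not necessarily the kernel's actual rule: forward moves advance exactly one phase, and backward moves are allowed only when a justification is recorded. The `checkTransition` name and the `Transition` shape are hypothetical.

```typescript
const PHASES = [
  "COLLECTION", "TRIAGE", "CLASSIFY", "INVESTIGATE",
  "TIMELINE", "CORRELATE", "REPORT",
] as const;
type Phase = (typeof PHASES)[number];

interface Transition {
  ok: boolean;
  reason: string;
}

// Forward: exactly one step. Backward: allowed, but only with a justification
// that can be written to the ledger. Skipping ahead is always refused.
function checkTransition(from: Phase, to: Phase, justification?: string): Transition {
  const i = PHASES.indexOf(from);
  const j = PHASES.indexOf(to);
  if (j === i + 1) return { ok: true, reason: "advance to next phase" };
  if (j > i + 1) return { ok: false, reason: `cannot skip ${PHASES[i + 1]}` };
  if (j === i) return { ok: false, reason: "already in this phase" };
  if (justification !== undefined && justification.trim() !== "") {
    return { ok: true, reason: `back-track: ${justification.trim()}` };
  }
  return { ok: false, reason: "back-tracking requires a justification" };
}
```

For example, moving from TIMELINE back to INVESTIGATE with the note "new persistence artifact found" is accepted, while jumping from COLLECTION straight to INVESTIGATE is refused.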

What we learned

  • Dempster-Shafer theory suits forensic evidence better than plain Bayesian updating because it models ignorance (the "I don't know" mass) explicitly, separate from belief for or against a hypothesis
  • Active Inference (minimizing expected free energy) naturally balances exploitation (run what's likely useful) vs exploration (run what reduces uncertainty the most)
  • MCP is the right abstraction — any AI agent that speaks the protocol gets full forensic capability without custom integration
  • Hash chains aren't just for blockchain hype — they make evidence ledgers tamper-evident in a way that matters for court admissibility
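
The exploration side of tool selection can be illustrated with a deliberately simplified score. This is not the expected-free-energy computation itself. It keeps only an information-gain term for a single binary hypothesis and subtracts a cost penalty, and it uses ordinary probabilities rather than belief masses. The `Tool` shape, the hit-rate parameters, and `costWeight` are hypothetical.

```typescript
// A candidate tool with a binary outcome (hit / no hit) for one hypothesis H.
interface Tool {
  name: string;
  pHitIfTrue: number;  // P(hit | H is true)
  pHitIfFalse: number; // P(hit | H is false)
  cost: number;        // relative runtime or effort
}

// Binary entropy in bits.
const entropy = (p: number): number =>
  p <= 0 || p >= 1 ? 0 : -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));

// Expected reduction in uncertainty about H from running the tool.
function expectedInfoGain(prior: number, t: Tool): number {
  const pHit = prior * t.pHitIfTrue + (1 - prior) * t.pHitIfFalse;
  const pMiss = 1 - pHit;
  const postHit = pHit > 0 ? (prior * t.pHitIfTrue) / pHit : prior;
  const postMiss = pMiss > 0 ? (prior * (1 - t.pHitIfTrue)) / pMiss : prior;
  return entropy(prior) - (pHit * entropy(postHit) + pMiss * entropy(postMiss));
}

// Rank tools by information gain minus a weighted cost, best first.
function rankTools(prior: number, tools: Tool[], costWeight = 0.1): Tool[] {
  const score = (t: Tool): number => expectedInfoGain(prior, t) - costWeight * t.cost;
  return [...tools].sort((a, b) => score(b) - score(a));
}
```

A tool whose outcome is equally likely whether or not the hypothesis holds scores zero gain, so it drops down the ranking however cheap it is. A highly discriminating tool stays near the top while the prior is still uncertain.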

Built With

  • active-inference
  • dempster-shafer-theory
  • model-context-protocol-(mcp)
  • node.js
  • pcr5-conflict-resolution
  • plaso
  • regripper
  • sift
  • sleuth-kit
  • sqlite
  • tshark
  • typescript
  • vitest
  • volatility3
  • yara