sift-kernel

About the Project

Inspiration

Digital forensics investigations follow a strict methodology: evidence must be reproducible, chain of custody must be unbroken, and conclusions must be defensible in court. Most current AI tooling for forensics just returns raw tool output. It records no reasoning, tracks no methodology, and gives no way to tell whether the AI covered every artifact category or hallucinated its findings.

We asked: what if we put a reasoning kernel between the LLM and the forensic toolchain? One that enforces methodology, fuses evidence mathematically, and produces reports with cryptographic integrity guarantees?

What it does

sift-kernel is a custom MCP (Model Context Protocol) server that gives any AI agent — Claude, GPT, Gemini — a forensic brain. Point it at a disk image and say "investigate." The kernel handles:

- 129 forensic operations across 14 categories (registry, event logs, filesystem, memory, network, persistence, execution artifacts, browser, user activity, timeline, anti-forensics, correlation, Linux, and acquisition)
- FARE (Forensic Active Reasoning Engine): evidence fusion using Dempster-Shafer theory with PCR5 conflict resolution, which copes with contradictory evidence where naive Bayesian fusion does not
- Active Inference (free energy minimization): prioritizes which tool to run next based on expected information gain
- Methodology FSM: enforces COLLECTION → TRIAGE → CLASSIFY → INVESTIGATE → TIMELINE → CORRELATE → REPORT progression
- Hash-chained evidence ledger: every tool invocation, its raw output, and the derived findings are stored in a tamper-evident SQLite chain with HMAC sealing
- Interactive HTML reports with SVG entropy curves, MITRE ATT&CK mapping, kill-chain timeline, and a confidence score per finding
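
To make the hash-chained ledger idea concrete, here is a minimal in-memory sketch. It is not the actual sift-kernel code: the `LedgerEntry` shape, the function names, and the choice of SHA-256 are assumptions made for illustration, and the real ledger stores its entries in SQLite rather than an array.

```typescript
import { createHash, createHmac } from "node:crypto";

// Hypothetical entry shape; the real ledger schema is not shown in this write-up.
interface LedgerEntry {
  index: number;
  tool: string;
  output: string;
  prevHash: string; // hash of the previous entry, or GENESIS for the first one
  hash: string;     // SHA-256 over this entry's fields plus prevHash
}

const GENESIS = "0".repeat(64);

function entryHash(index: number, tool: string, output: string, prevHash: string): string {
  // Serializing the fields as a JSON array keeps field boundaries unambiguous.
  return createHash("sha256")
    .update(JSON.stringify([index, tool, output, prevHash]))
    .digest("hex");
}

// Returns a new chain with one more entry linked to the current head.
function appendEntry(chain: LedgerEntry[], tool: string, output: string): LedgerEntry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const index = chain.length;
  const hash = entryHash(index, tool, output, prevHash);
  return [...chain, { index, tool, output, prevHash, hash }];
}

// Returns the position of the first broken entry, or -1 if the chain is intact.
function verifyChain(chain: LedgerEntry[]): number {
  let prevHash = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    const expected = entryHash(i, e.tool, e.output, prevHash);
    if (e.index !== i || e.prevHash !== prevHash || e.hash !== expected) return i;
    prevHash = e.hash;
  }
  return -1;
}

// Seals the chain head with a keyed MAC, so someone without the key
// cannot rewrite the chain and produce a matching seal.
function sealChain(chain: LedgerEntry[], key: string): string {
  const last = chain[chain.length - 1];
  return createHmac("sha256", key).update(last ? last.hash : GENESIS).digest("hex");
}
```

Because each hash covers the previous hash, editing any stored output invalidates that entry and every entry after it. The HMAC seal over the head covers the remaining case where an attacker recomputes the whole chain.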

The math behind evidence fusion:

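$$m_{1,2}(A) = \frac{\sum_{B \cap C = A} m_1(B) \cdot m_2(C)}{1 - k}, \qquad k = \sum_{B \cap C = \emptyset} m_1(B) \cdot m_2(C)$$

When the conflict mass $k$ is high ($k > 0.3$), we switch to PCR5 (Proportional Conflict Redistribution). PCR5 starts from the unnormalized conjunctive sum $m_{12}(A) = \sum_{B \cap C = A} m_1(B) \cdot m_2(C)$ and returns each conflicting product to the two sets that produced it, in proportion to their masses:

$$m_{PCR5}(A) = m_{12}(A) + \sum_{B \cap A = \emptyset} \left[ \frac{m_1(A)^2 \cdot m_2(B)}{m_1(A) + m_2(B)} + \frac{m_2(A)^2 \cdot m_1(B)}{m_2(A) + m_1(B)} \right]$$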
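
The two rules above can be sketched for two sources as follows. This is an illustration under stated assumptions rather than the kernel's implementation: subsets of the frame are encoded as bitmasks, the `Mass` type and all function names are hypothetical, and the example masses are made up.

```typescript
// A mass function maps a non-empty subset of hypotheses (a bitmask) to its mass.
type Mass = Map<number, number>;

// Conflict k: total product mass that lands on an empty intersection.
function conflict(m1: Mass, m2: Mass): number {
  let k = 0;
  for (const [b, mb] of m1) {
    for (const [c, mc] of m2) {
      if ((b & c) === 0) k += mb * mc;
    }
  }
  return k;
}

// Unnormalized conjunctive sum m12(A) over all pairs with B ∩ C = A.
function conjunctive(m1: Mass, m2: Mass): Mass {
  const out: Mass = new Map();
  for (const [b, mb] of m1) {
    for (const [c, mc] of m2) {
      const a = b & c;
      if (a !== 0) out.set(a, (out.get(a) ?? 0) + mb * mc);
    }
  }
  return out;
}

// Dempster's rule: conjunctive sum renormalized by 1 - k.
function dempster(m1: Mass, m2: Mass): Mass {
  const k = conflict(m1, m2);
  if (k >= 1) throw new Error("total conflict: Dempster's rule is undefined");
  const out: Mass = new Map();
  for (const [a, v] of conjunctive(m1, m2)) out.set(a, v / (1 - k));
  return out;
}

// PCR5: each conflicting product m1(A)·m2(B) goes back to A and B
// in proportion to m1(A) and m2(B).
function pcr5(m1: Mass, m2: Mass): Mass {
  const out = conjunctive(m1, m2);
  for (const [a, ma] of m1) {
    for (const [b, mb] of m2) {
      if ((a & b) !== 0 || ma + mb === 0) continue;
      out.set(a, (out.get(a) ?? 0) + (ma * ma * mb) / (ma + mb));
      out.set(b, (out.get(b) ?? 0) + (mb * mb * ma) / (ma + mb));
    }
  }
  return out;
}

// Switch rule from the text: PCR5 above the conflict threshold, Dempster otherwise.
function fuse(m1: Mass, m2: Mass, threshold = 0.3): Mass {
  return conflict(m1, m2) > threshold ? pcr5(m1, m2) : dempster(m1, m2);
}

// Made-up example: one signal points to MALICIOUS, the other to BENIGN.
const MALICIOUS = 0b01, BENIGN = 0b10, UNKNOWN = 0b11;
const timestompSignal: Mass = new Map([[MALICIOUS, 0.8], [UNKNOWN, 0.2]]);
const signatureSignal: Mass = new Map([[BENIGN, 0.7], [UNKNOWN, 0.3]]);
const fused = fuse(timestompSignal, signatureSignal);
```

In this example $k = 0.8 \cdot 0.7 = 0.56$, so `fuse` takes the PCR5 branch. The fused masses are roughly 0.539 for MALICIOUS, 0.401 for BENIGN, and 0.06 for UNKNOWN, and they still sum to 1 because PCR5 redistributes the conflicting mass rather than discarding it.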

How we built it

TypeScript MCP server with hexagonal architecture:

- Domain layer: methodology FSM, capability DAG, finding/hypothesis lifecycle, hash-chained ledger
- Reasoning layer: Dempster-Shafer fusion, Active Inference (EFE), rough-sets confidence tiers, bias detection, convergence monitoring, forensic ontology (27 nodes)
- Adapter layer: process executor (shell:false + command allowlist for security), SQLite ledger, filesystem evidence store
- Intelligence layer: beaconing detection, timestomping analysis, log gap identification, burst detection, wiping tool signatures
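
The process executor in the adapter layer combines two safeguards: no shell and a command allowlist. A minimal sketch of that pattern, using Node's `execFile`, might look like this. The allowlist contents, the timeout and buffer limits, and the names `assertAllowed` and `runTool` are assumptions for illustration, not the actual adapter.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical allowlist; the real kernel's list is not shown in this write-up.
const ALLOWED_COMMANDS: ReadonlySet<string> = new Set(["fls", "icat", "istat", "tshark", "yara"]);

// Rejects anything that is not an exact allowlisted command name,
// which also rejects paths and strings carrying shell metacharacters.
function assertAllowed(command: string): void {
  if (!ALLOWED_COMMANDS.has(command)) {
    throw new Error(`command not allowed: ${command}`);
  }
}

// Runs an allowlisted tool with arguments passed as an array.
// With shell: false the arguments are never interpreted by a shell.
async function runTool(command: string, args: readonly string[]): Promise<string> {
  assertAllowed(command);
  const { stdout } = await execFileAsync(command, [...args], {
    shell: false,
    timeout: 60000,              // assumed limit: stop tools that hang
    maxBuffer: 64 * 1024 * 1024, // assumed limit: forensic output can be large
  });
  return stdout;
}
```

Passing arguments as an array means a value such as `"; rm -rf /"` reaches the tool as one literal argument instead of being executed, and the allowlist check runs before any process is spawned.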

The server wraps SIFT Workstation tools (Sleuth Kit, RegRipper, Volatility3, Plaso, YARA, tshark, Zimmerman .NET tools) and exposes them as MCP tools that any AI agent can call. The kernel tracks what's been done, what's missing, and what should come next.

Challenges we faced

- ESM/CJS boundary: import semantics for the Node.js crypto module differ between vitest's loader and raw ESM. Tests passed, but the server crashed at runtime until we found a stray require() in a pure-ESM module
- Evidence image tooling: SIFT tools expect mounted filesystems, so we had to build a raw-offset extraction pipeline that runs icat/fls/istat directly on E01 images
- Conflict in evidence: real investigations produce contradictory signals (timestomped files that look legitimate, cleared logs that might be routine maintenance). Naive fusion breaks down here; PCR5 redistributes the conflicting mass instead of normalizing it away
- Methodology enforcement without rigidity: the FSM must prevent skipping phases but not block legitimate back-tracking when new evidence changes the picture
- Testing the kernel itself: 114 unit, integration, and property tests cover the reasoning math, ledger integrity, parser correctness, and FSM transitions, so that the kernel's own behavior is defensible
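
One way to reconcile phase enforcement with back-tracking is sketched below. The transition rule is an assumption chosen for illustration (advance exactly one phase, or return to any earlier phase), and `canTransition` and `MethodologyFsm` are hypothetical names. The kernel's actual rules may differ.

```typescript
// Phase order taken from the methodology described above.
const PHASES = [
  "COLLECTION", "TRIAGE", "CLASSIFY", "INVESTIGATE", "TIMELINE", "CORRELATE", "REPORT",
] as const;
type Phase = (typeof PHASES)[number];

// Assumed rule: move forward by exactly one phase, or back to any earlier phase.
// Skipping ahead and staying in place are both rejected.
function canTransition(from: Phase, to: Phase): boolean {
  const i = PHASES.indexOf(from);
  const j = PHASES.indexOf(to);
  return j === i + 1 || j < i;
}

class MethodologyFsm {
  private current: Phase = "COLLECTION";
  // Every phase visited, in order, so back-tracking stays visible in the record.
  readonly history: Phase[] = ["COLLECTION"];

  get phase(): Phase {
    return this.current;
  }

  transition(to: Phase): void {
    if (!canTransition(this.current, to)) {
      throw new Error(`illegal transition ${this.current} -> ${to}`);
    }
    this.current = to;
    this.history.push(to);
  }
}
```

Under this rule an agent in TIMELINE can drop back to TRIAGE when new evidence appears, but it then has to walk forward through CLASSIFY and INVESTIGATE again, and the `history` array records the detour.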

What we learned

- Dempster-Shafer theory fits forensics better than plain Bayesian updating because it models ignorance (the "I don't know" mass) explicitly, separate from belief
- Active Inference (minimizing expected free energy) naturally balances exploitation (run what's likely useful) against exploration (run what reduces uncertainty the most)
- MCP is the right abstraction: any AI agent that speaks the protocol gets full forensic capability without custom integration
- Hash chains aren't just for blockchain hype: they make evidence ledgers tamper-evident in a way that matters for court admissibility
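
The exploration side of that trade-off can be illustrated with a much simpler stand-in than full expected free energy: rank tools by expected information gain about a single yes/no hypothesis. The sketch below uses plain Bayesian probabilities for brevity. The `ToolModel` shape, the function names, and the likelihood values used with it are all hypothetical, and it is not the kernel's EFE computation.

```typescript
// Hypothetical model of a tool: how often it reports a hit when the
// hypothesis is true, and how often when it is false.
interface ToolModel {
  name: string;
  pPosGivenH: number;
  pPosGivenNotH: number;
}

// Binary entropy in bits.
function entropy(p: number): number {
  if (p <= 0 || p >= 1) return 0;
  return -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));
}

// Expected reduction in entropy about H from running the tool once:
// H(prior) - E[H(posterior)], averaged over the tool's two possible outcomes.
function expectedInfoGain(prior: number, t: ToolModel): number {
  const pPos = prior * t.pPosGivenH + (1 - prior) * t.pPosGivenNotH;
  const pNeg = 1 - pPos;
  const postPos = pPos > 0 ? (prior * t.pPosGivenH) / pPos : prior;
  const postNeg = pNeg > 0 ? (prior * (1 - t.pPosGivenH)) / pNeg : prior;
  return entropy(prior) - (pPos * entropy(postPos) + pNeg * entropy(postNeg));
}

// Most informative tool first; the input array is left untouched.
function rankTools(prior: number, tools: readonly ToolModel[]): ToolModel[] {
  return [...tools].sort((a, b) => expectedInfoGain(prior, b) - expectedInfoGain(prior, a));
}
```

A tool whose output is equally likely either way has zero expected gain and sorts last, while a tool that separates the two cases perfectly recovers the full entropy of the prior. A fuller treatment would add a term for the exploitation side, which this sketch omits.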

Built With

active-inference
dempster-shafer-theory
model-context-protocol-(mcp)
node.js
pcr5-conflict-resolution
plaso
regripper
sift
sleuth-kit
sqlite
tshark
typescript
vitest
volatility3
yara

Submitted to

FIND EVIL!

Created by

Solo developer. I designed the architecture, wrote all code, ran all investigations, and produced the final submission.
Specifically:
- Designed the FARE reasoning engine (Dempster-Shafer/PCR5 fusion, Active Inference tool selection, methodology FSM)
- Implemented the full TypeScript MCP server (2,300+ lines of server.ts, 129 forensic operations across 14 categories)
- Built the hash-chained SQLite evidence ledger with HMAC sealing
- Created the capability DAG and methodology state machine (COLLECTION through REPORT)
- Wrote 114 unit/integration/property tests (vitest)
- Set up SIFT Workstation tooling via podman container with shell wrappers
- Ran live investigations on two forensic disk images (SRL-2018 + ROCBA cases)
- Produced all documentation (architecture, accuracy report, dataset docs, execution logs)
- Recorded the demo video
No teammates. No code contributions from others. AI coding assistants were used as tools during development (same as using an IDE with autocomplete).

Sathvik Gilakamsetty

Updates

Sathvik Gilakamsetty started this project — Jun 15, 2026 11:38 PM EDT
