Inspiration
Attackers now break out in as little as 27 seconds (CrowdStrike 2026 Global Threat Report, down from 51 seconds a year earlier). And the offense is already an AI: in November 2025 Anthropic disclosed GTG-1002, a state-sponsored group that drove Claude Code through a campaign against around 30 organizations, with the AI performing 80 to 90 percent of the tactical work autonomously.
Defense has to answer at machine speed. But the moment you point an autonomous agent at evidence, you inherit its hallucinations. Protocol SIFT's own creator says it plainly: "It works. It also hallucinates more than we'd like." A machine-speed agent that hallucinates produces evidence no court will accept, and when evidence is excluded for a broken chain of custody, the whole investigation re-runs. The average breach already costs $4.88M (IBM 2024).
Read-only mounting, the "we never touch the original" claim, was table stakes in 2008. What actually decides cases is chain of custody, Daubert admissibility, and a documented error rate. So I built the layer nobody else builds.
What it does
SEALED re-architects Protocol SIFT from prompt files into a purpose-built, typed, read-only Model Context Protocol (MCP) server over the real SIFT Workstation toolchain (Volatility 3, EZ Tools, plaso, Sleuth Kit), driven by Claude Code as the autonomous runtime. The agent triages a host, builds a cross-source timeline, and seals findings, but it cannot call a finding CONFIRMED in prose. It must pass each finding through a provenance gate, and the entire run lands as a cryptographically signed chain of custody.
On the NIST CFReDS "Hacking Case" (the Greg Schardt / "Mr. Evil" wardriving case), SEALED produces 11 of 12 court-sealed findings, 0 hallucinated, scored against the NIST published answer key. It deliberately abstains on the 12th, because the tool emits per-account rows, not a total it can prove. That honest INFERRED is the point: the architecture would rather say "I can't seal that" than guess.
How we built it
The server is the protagonist; Claude Code is the runtime that drives it. Every guarantee lives below the agent, enforced in code, never in a prompt an injection could rewrite.
There are four enforced properties.
First, capability absence: the registry exposes only typed read and parse functions, so a destructive command, even one injected into a log line in the evidence, has nowhere to run. You can prove this by running python -m sealed_mcp.redteam, which fires nine attacks (injecting dd and rm, forging a seal three different ways, tampering with a signed ledger, trying to overwrite source or evidence), shows every one blocked, and exits with code 0.
Second, proven read-only access: each tool hashes the evidence before and after its call, and the proof is the pair of identical SHA-256 digests recorded in the ledger, not the mere absence of a shell.
Third, a signed chain of custody: every call is prev-hash-chained into an append-only ledger whose head is ed25519-signed, so an attacker who rewrites or shortens the log has to forge a signature that verifies under the published public key, without holding the private key.
Fourth, a provenance gate: a finding seals CONFIRMED only if the ledger verifies and every relied-on field appears in the rows the tool actually recorded, checked against the hashed bytes, never against a value the agent supplied.
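The second and third properties can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not SEALED's actual code: the names call_read_only, entry_digest, and count_lines, the entry fields, and the all-zero genesis hash are all hypothetical, and signing the ledger head is left out to keep the sketch dependency-free.

```python
import hashlib
import json
import os
import tempfile

GENESIS = "0" * 64  # prev_hash of the first entry (assumed convention)


def sha256_file(path: str) -> str:
    """Stream a file through SHA-256 so large images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def entry_digest(entry: dict) -> str:
    """Hash every field except the digest itself, in a canonical key order."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def call_read_only(ledger: list, tool_name: str, tool_fn, evidence_path: str) -> dict:
    """Run a parse function, check the evidence is unchanged, chain the record."""
    before = sha256_file(evidence_path)
    rows = tool_fn(evidence_path)
    after = sha256_file(evidence_path)
    if before != after:
        raise RuntimeError(f"{tool_name} modified the evidence")
    entry = {
        "seq": len(ledger),
        "prev_hash": ledger[-1]["entry_hash"] if ledger else GENESIS,
        "tool": tool_name,
        "evidence_sha256_before": before,
        "evidence_sha256_after": after,
        "rows": rows,
    }
    entry["entry_hash"] = entry_digest(entry)
    ledger.append(entry)
    return entry


def count_lines(path: str) -> list:
    """Hypothetical stand-in for a real parser: reads only, returns rows."""
    with open(path, "rb") as handle:
        return [{"line_count": sum(1 for _ in handle)}]


def demo() -> list:
    """Record two calls against a throwaway evidence file."""
    ledger = []
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"event one\nevent two\n")
    try:
        call_read_only(ledger, "count_lines", count_lines, tmp.name)
        call_read_only(ledger, "count_lines", count_lines, tmp.name)
    finally:
        os.unlink(tmp.name)
    return ledger
```

Because each entry commits to the digest of the one before it, changing any recorded row changes that entry's digest and breaks every later link. In the design described above, an ed25519 signature over the head would then pin the whole chain.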
The honesty claim is itself measured, not asserted. Running python -m sealed_mcp.triage triages the NIST case end to end and shows the gate sealing supported claims while downgrading an over-reach to INFERRED, live. Running python accuracy/run_benchmark.py replays every CONFIRMED finding back through the real gate over the rows the tools recorded: SEALED scores 0 hallucinated, while a baseline that asserts CONFIRMED without a trace has all of them scored as hallucinated. And because chain of custody is the thing this panel adjudicates, I shipped a standalone verifier: python -m sealed_mcp.verify re-derives the hash chain, the truncation anchor, and the ed25519 signature against a public key you supply, so you never have to trust me.
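What a standalone verifier has to re-derive can be sketched as follows. This is a hedged illustration, not the sealed_mcp.verify implementation: it assumes the truncation anchor is a signed (entry count, head hash) pair, and the names build_ledger, make_anchor, and verify_ledger are hypothetical. The ed25519 check over the anchor is omitted here; a real verifier would perform it first, against the public key the reviewer supplies.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first entry


def entry_digest(entry: dict) -> str:
    """Canonical SHA-256 over every field except the stored digest."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def build_ledger(row_sets: list) -> list:
    """Hypothetical helper: chain one entry per recorded row set."""
    entries = []
    for rows in row_sets:
        entry = {
            "seq": len(entries),
            "prev_hash": entries[-1]["entry_hash"] if entries else GENESIS,
            "rows": rows,
        }
        entry["entry_hash"] = entry_digest(entry)
        entries.append(entry)
    return entries


def make_anchor(entries: list) -> dict:
    """The (count, head) pair that would be signed in the real system."""
    head = entries[-1]["entry_hash"] if entries else GENESIS
    return {"count": len(entries), "head": head}


def verify_ledger(entries: list, anchor: dict) -> tuple:
    """Re-derive every link, then compare length and head to the anchor."""
    prev = GENESIS
    for index, entry in enumerate(entries):
        if entry.get("seq") != index:
            return False, f"entry {index}: sequence gap"
        if entry.get("prev_hash") != prev:
            return False, f"entry {index}: broken link"
        if entry.get("entry_hash") != entry_digest(entry):
            return False, f"entry {index}: content altered"
        prev = entry["entry_hash"]
    if anchor["count"] != len(entries):
        return False, "length does not match anchor"
    if anchor["head"] != prev:
        return False, "head does not match anchor"
    return True, "ok"
```

The anchor matters because a chain that has merely been shortened is still internally consistent. Only the count and head comparison exposes the missing tail.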
That one triage command produces the entire deliverable bundle locally, with no SIFT OVA: an interactive dashboard, a court-ready case report, Daubert provenance cards, the signed custody ledger and its attestation, the findings file, the cross-source timeline, and a gate-verified accuracy report.
The unforgettable moment
Watch the agent try to lie, and the architecture stop it. A forged claim, "the registered owner is Eric R. Johnson," is stamped INFERRED with the gate's printed reason, right beside the true claim, "the registered owner is Greg Schardt," stamped CONFIRMED. Same tool, same ledger entry, same evidence hash. Only the claim differs, and the architecture refuses the false one. You can watch it in the dashboard, or reproduce it live inside Claude Code with the /forgery_demo prompt the server ships.
Challenges we ran into
The hardest part was resisting the overclaim. The most instructive bug was in the gate itself: an early version checked relied-on fields against a tool_outputs value the agent passed in, so a prompt-injected agent could seal a fabricated CONFIRMED finding just by echoing its own lie. The fix was to bind adjudication to the rows the tool actually recorded and hashed, so a forged echo can never seal anything. That exact attack is now a red-team test that runs on every commit. The lesson: an integrity layer is only real if you can attack it and watch it hold.
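The bug and its fix can be reduced to a small contrast. This is a sketch under assumptions rather than the real gate: the finding shape with a relies_on mapping, the RegisteredOwner field name, and the functions record, gate_vulnerable, and gate_bound are hypothetical, and the ledger verification the real gate also requires is assumed to have already passed.

```python
import hashlib
import json


def rows_digest(rows: list) -> str:
    """Canonical SHA-256 over the rows a tool returned."""
    return hashlib.sha256(json.dumps(rows, sort_keys=True).encode()).hexdigest()


def record(rows: list) -> dict:
    """What the server stores at call time: the rows and their digest."""
    return {"rows": rows, "rows_sha256": rows_digest(rows)}


def gate_vulnerable(finding: dict, tool_outputs: dict) -> str:
    """Early design: trusts values the agent passes in alongside its claim."""
    supported = all(
        tool_outputs.get(field) == value
        for field, value in finding["relies_on"].items()
    )
    return "CONFIRMED" if supported else "INFERRED"


def gate_bound(finding: dict, ledger_entry: dict) -> str:
    """Fixed design: adjudicates only against rows recorded and hashed."""
    rows = ledger_entry["rows"]
    if rows_digest(rows) != ledger_entry["rows_sha256"]:
        return "INFERRED"  # rows no longer match what was hashed at call time
    for field, value in finding["relies_on"].items():
        if not any(row.get(field) == value for row in rows):
            return "INFERRED"  # relied-on value never appeared in tool output
    return "CONFIRMED"


# The tool actually read this value from the evidence.
entry = record([{"RegisteredOwner": "Greg Schardt"}])

true_claim = {"relies_on": {"RegisteredOwner": "Greg Schardt"}}
forged_claim = {"relies_on": {"RegisteredOwner": "Eric R. Johnson"}}

# A prompt-injected agent echoes its own lie as "tool output".
echoed_lie = {"RegisteredOwner": "Eric R. Johnson"}
```

With these inputs, gate_vulnerable(forged_claim, echoed_lie) seals the fabrication, while gate_bound(forged_claim, entry) downgrades it and gate_bound(true_claim, entry) still confirms the real owner. The fixed gate never sees anything the agent supplied except the claim itself.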
Accomplishments that we're proud of
The accuracy number is machine-verified and reproducible, not self-reported: the benchmark re-runs each CONFIRMED finding through the real gate over the rows the tools recorded, and a test asserts the committed result is byte-reproducible from the build, so nobody can hand-edit a CONFIRMED back in. The chain of custody is cryptographically signed and independently verifiable, with a standalone verifier a judge can run without trusting me. The MCP surface is genuinely complete: nineteen typed tools, five resources including the live custody ledger, and two prompts, so the protagonist is the server, not the prompt. One command produces the entire court-ready deliverable bundle, offline, and the whole thing ships with 82 passing tests.
What we learned
That the defensible claim for an autonomous forensic agent is narrow and specific, and that narrowness is the product. "It can't be wrong" loses to a cross-examiner in seconds. "It cannot present a finding it can't trace to a verified read of hashed evidence, and here is the signed log to check" wins. Building to that standard forced an architecture where integrity is a property of the code, not a promise in a prompt.
What's next for SEALED
Upstream the typed tools and the integrity layer into Protocol SIFT, deepen the tool surface against a real acquired image on the SIFT Workstation OVA, and broaden the ground-truth benchmark beyond the NIST Hacking Case.
Built With
- claude-code
- cryptography
- ez-tools
- fastmcp
- hashdeep
- mcp
- python