CyberProof

Automated, W3C PROV-verified cyber insurance evidence in minutes

Inspiration

Every cyber incident starts a regulatory clock: 24 hours for NIS2's early warning, 72 hours for GDPR Article 33, and insurers typically expect notice within 48 hours. Yet the evidence package needed to file a claim is still assembled by hand, over days or weeks. At the same time, W3C PROV offers a powerful idea: capture why a system produced a result, not just the result, which turns a claim into something verifiable. SOAR playbooks already execute a structured response. What if that execution became the provenance record, and that record became the backbone of a defensible insurance claim?

What it does

CyberProof is a zero-touch pipeline that turns a completed Splunk SOAR playbook run into a court-grade cyber insurance evidence package, automatically.

When a playbook run finishes:

Provenance capture: builds a W3C PROV-JSON graph (Activities, Agents, wasInformedBy causal chain) from the SOAR REST API, SHA-256 hashed and rendered to SVG.
Forensic enrichment: extracts the actual SPL queries the playbook ran from SOAR's logs and re-runs them via the Splunk MCP Server against BOTS v3 for real attacker timelines.
Legal evidence generation: SaulLM-7B generates a multi-section insurance package: incident summary, causal-chain proof, regulatory deadlines, financial impact, coverage clauses, chain of custody, and forensic evidence.
Dashboard delivery: posted to Splunk via HEC and visualized in Dashboard Studio: NIS2/GDPR/insurance countdowns, total claim, attack timeline, and links to every artifact.

From "playbook finished" to "claim amount + regulatory status + signed evidence document" in about a minute, with no human in the loop.
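The regulatory countdowns shown on the dashboard come down to simple date arithmetic. The sketch below is illustrative only: it uses the three windows quoted in this write-up (24, 48, and 72 hours) and assumes the clock starts at the moment the incident was detected. The function name `deadline_status` and the row layout are hypothetical, not taken from the project's code, and real policies or regulators may define the start of the clock differently.

```python
from datetime import datetime, timedelta, timezone

# Notification windows quoted in this write-up, in hours.
# The 48-hour insurer window is the "typical" figure; real policies vary.
DEADLINE_HOURS = {
    "NIS2 early warning": 24,
    "Insurer notice": 48,
    "GDPR Article 33": 72,
}


def deadline_status(detected_at, now):
    """Return one row per deadline: due time, hours remaining, overdue flag.

    Assumption: every clock starts at `detected_at`.
    """
    rows = []
    for name, hours in DEADLINE_HOURS.items():
        due = detected_at + timedelta(hours=hours)
        remaining = (due - now).total_seconds() / 3600
        rows.append({
            "deadline": name,
            "due": due.isoformat(),
            "hours_remaining": round(remaining, 1),
            "overdue": remaining < 0,
        })
    return rows


if __name__ == "__main__":
    # Hypothetical incident: detected at 08:00 UTC, checked 30 hours later.
    detected = datetime(2026, 6, 15, 8, 0, tzinfo=timezone.utc)
    for row in deadline_status(detected, detected + timedelta(hours=30)):
        print(row)
```

Using timezone-aware datetimes throughout avoids off-by-hours errors when the SOAR instance and the dashboard run in different time zones.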

How it has been built

SOAR adapter for W3C PROV, following yProv4WFs's existing plugin pattern: container → Activity(level0), playbook_run → Activity(level1), action_run → Activity(level2), app/asset → Agent, cb_fn → wasInformedBy. Validated against a real SOAR 8.5.0 instance and a custom playbook on the BOTS v3 "Operation Frothly" scenario.
Dynamic MCP enrichment: parses the For Parameter: {...} Message: JSON in each app_run to recover the exact query the playbook ran, then dispatches it to the Splunk MCP Server. Fully playbook-agnostic.
Evidence generation: iterated prompts for consistent sections, deduplicated timelines, accurate deadline math, and currency-separated financials, with rates/metadata in a single config file.
Dashboard: built in Dashboard Studio with spath-based extraction, color-coded deadline table, financial breakdown, and a live attack-timeline table from index=botsv3.
Auto-trigger: a Flask listener receives on_finish()'s POST and runs the pipeline in the background, with zero manual steps.
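To make the adapter mapping in the first item concrete, here is a minimal sketch of how SOAR records could be turned into a PROV-JSON document. It is not the yProv4WFs plugin code. The record field names (`id`, `name`, `action`, `app`), the `ex:` namespace, and the `ex:level` attribute are all hypothetical, and the causal chain is simplified: it assumes the action runs arrive in callback order, so each action is informed by the one before it, whereas the real adapter derives those links from `cb_fn`.

```python
def build_prov(container, playbook_run, action_runs):
    """Map SOAR records onto a PROV-JSON document (a plain dict).

    container -> Activity (level 0), playbook_run -> Activity (level 1),
    action_run -> Activity (level 2), app -> Agent.
    """
    doc = {
        "prefix": {"ex": "https://example.org/cyberproof#"},  # hypothetical namespace
        "activity": {},
        "agent": {},
        "wasAssociatedWith": {},
        "wasInformedBy": {},
    }

    def add_activity(kind, rec_id, level, label):
        key = f"ex:{kind}_{rec_id}"
        # "ex:level" is an illustrative attribute, not a PROV term.
        doc["activity"][key] = {"prov:label": label, "ex:level": level}
        return key

    add_activity("container", container["id"], 0, container["name"])
    add_activity("playbook_run", playbook_run["id"], 1, playbook_run["name"])

    previous = None
    for n, run in enumerate(action_runs):
        act = add_activity("action_run", run["id"], 2, run["action"])
        agent = f"ex:app_{run['app']}"
        doc["agent"][agent] = {"prov:label": run["app"]}
        doc["wasAssociatedWith"][f"_:assoc{n}"] = {
            "prov:activity": act,
            "prov:agent": agent,
        }
        # Simplification: treat list order as the callback chain.
        if previous is not None:
            doc["wasInformedBy"][f"_:inf{n}"] = {
                "prov:informed": act,
                "prov:informant": previous,
            }
        previous = act
    return doc
```

Because the result is an ordinary dictionary, it can be serialized with `json.dumps` and handed to any tool that reads PROV-JSON, which keeps the capture step deterministic and independent of the LLM.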

What I learned

Provenance is a trust layer. Separating deterministic provenance capture from LLM narrative means the graph is the source of truth, and the LLM just summarizes it.
Chain-of-custody needs proof, not assertion. A SHA-256 hash turned "unaltered" from a claim into something verifiable, cheaply.
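The hashing idea can be illustrated in a few lines. This is a sketch under assumptions rather than the project's implementation: it assumes the provenance graph is held as a JSON-serializable dict and that it is serialized canonically (sorted keys, fixed separators) before hashing, so the same graph always yields the same digest. The function names are hypothetical.

```python
import hashlib
import json


def canonical_bytes(prov_doc):
    """Serialize deterministically so key order cannot change the hash."""
    return json.dumps(prov_doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(prov_doc):
    """Return the SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(prov_doc)).hexdigest()


def verify(prov_doc, recorded_digest):
    """True if the document still matches the digest recorded at capture time."""
    return fingerprint(prov_doc) == recorded_digest


if __name__ == "__main__":
    # Hypothetical one-activity graph.
    doc = {"activity": {"ex:action_run_10": {"prov:label": "run query"}}}
    digest = fingerprint(doc)
    print(verify(doc, digest))  # unchanged document

    doc["activity"]["ex:action_run_10"]["prov:label"] = "edited"
    print(verify(doc, digest))  # document altered after capture
```

The check is only as trustworthy as the place the recorded digest is kept: anyone who can rewrite both the graph and the digest can defeat it, so the digest should be stored separately from the evidence package.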

What's next for CyberProof

Splunk AI Assistant (SAIA) integration for natural-language → SPL generation.
Branching/parallel provenance for multi-path investigations.
More playbook types: ransomware, BEC, data exfiltration.
Local SaulLM deployment for air-gapped/data-sensitive environments.
MITRE ATT&CK annotation of the provenance graph.
Blockchain-anchored provenance hashes for independent, third-party-verifiable chain of custody between insurer and insured.

Built With

botsv3
hec
huggingface
mcp
prov
python
saullm-7b
soar
splunk-mcp
w3c
yprov4wfs

Updates

hapix os started this project — Jun 15, 2026 11:31 AM EDT
