- Provides the best brand impact with the magnifying glass logo and 'Find Evil' tagline.
- Custom MCP Server: OpenClaw → 9 MCP tools → Supabase Realtime → React Flow. Self-correction loop shown.
- Highlights the active AI tool correlation (e.g. VirusTotal lookup) in real time, showing the agent actually working.
- Showcases a complex LockBit 3.0 ransomware kill chain, demonstrating that the attack graph can handle multi-stage threats.
- Visualizes a supply-chain attack correlation, highlighting the glowing node connections as the agent investigates.
- Architecture diagram showing real-time data flow from the SIEM, to the OpenClaw Python Agent (via MCP), to the Next.js React Flow UI.
- Highlights the Investigation Terminal streaming the step-by-step reasoning log.
Evidence Dataset Documentation
SIFT.Glass uses 4 curated investigation datasets, each modeling a distinct real-world attack pattern. All evidence is synthetic but structurally faithful to production SIEM telemetry.
Dataset 1 — Supply-Chain Attack (Golden Path)
| Artifact | Type | Value |
|---|---|---|
| Source IP | Internal host | 192.168.1.42 |
| Malicious package | npm | evil-pkg@2.1.0 |
| Package hash | SHA256 | a1b2c3d4e5f6...f0a1b2 |
| False-positive domain | CDN | cdn.legit-analytics.com (Cloudflare — triggers self-correction) |
| True C2 domain | APT-41 | data-exfil.darknet.io |
| Dropped binary | File | /tmp/.hidden_shell (ELF reverse shell) |
| Service account | User | svc_deploy (CI/CD lateral movement) |
| Triggering process | Process | npm install (PID 4821, postinstall script) |
Evidence source: agent/mock_siem.py — raw SIEM alert with 4 raw events (2× DNS, 1× process, 1× file creation) and 1 package artifact.
Constraint data: agent/mcp_server.py — MALICIOUS_HASHES dict (1 known hash) + MALICIOUS_DOMAINS dict (1 known C2) + LEGITIMATE_DOMAINS set (5 CDN/cloud providers).
Dataset 2 — Ransomware Lateral Movement
8 nodes modeling: phishing email (invoice_q4.docm) → VBA macro → PowerShell stager → Cobalt Strike C2 (185.220.101.42) → Zerologon exploit on Domain Controller → credential dump (admin_backup) → LockBit 3.0 deployment → shadow copy deletion (VSSADMIN). 7 edges trace the full kill chain across 47 encrypted hosts.
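The chain above is linear, so it can be sketched as an ordered list of stages with one edge between each consecutive pair. This is a minimal illustration only: the `Node` and `Edge` classes, the `build_chain` helper, and the `node-N` ids are hypothetical and are not taken from `lib/demo-data.ts`; only the stage labels come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str      # hypothetical id scheme, not the project's real one
    label: str


@dataclass(frozen=True)
class Edge:
    source: str
    target: str


# Stage labels copied from the Dataset 2 description, in kill-chain order.
STAGES = [
    "Phishing email (invoice_q4.docm)",
    "VBA macro",
    "PowerShell stager",
    "Cobalt Strike C2 (185.220.101.42)",
    "Zerologon exploit on Domain Controller",
    "Credential dump (admin_backup)",
    "LockBit 3.0 deployment",
    "Shadow copy deletion (VSSADMIN)",
]


def build_chain(labels):
    """Create one node per stage and link each stage to the next one."""
    nodes = [Node(id=f"node-{i}", label=label)
             for i, label in enumerate(labels, start=1)]
    edges = [Edge(a.id, b.id) for a, b in zip(nodes, nodes[1:])]
    return nodes, edges


nodes, edges = build_chain(STAGES)
print(len(nodes), len(edges))  # 8 7
```

A chain of n stages always yields n - 1 edges, which matches the 8 nodes and 7 edges stated for this dataset.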
Dataset 3 — Credential Stuffing Campaign
7 nodes modeling: residential proxy (45.33.32.156, 12,847 login attempts) → RockYou2026 combo list → account takeover (jsmith@corp.com) → false-positive on api.stripe.com (self-correction trigger) → session hijack script → Telegram C2 exfil (t.me/dark_creds_bot).
Dataset 4 — Insider Threat / Data Exfiltration
7 nodes modeling: departing engineer (mwilson, resignation 72h prior) → bulk git clone (23 repos in 12 min) → trade secret ML model weights (4.7 GB) → dual exfil to MEGA cloud + USB drive → false-positive on VPN gateway (self-correction trigger) → DLP alert.
Constraint-Checking System
The agent enforces mandatory constraint checks before classifying any node as malicious:
- hash_constraint_check: Queries SHA256 against a threat intel database. Returns MALICIOUS (with classification) or CLEAN.
- domain_reputation: Queries domain against malicious + legitimate lists. Returns MALICIOUS, LEGITIMATE (triggers self-correction), or UNKNOWN.
- cancel_hypothesis: Called automatically when a constraint mismatch is detected. Sets node status to shattered, confidence to 0, and deletes outgoing edges.
All datasets are defined in lib/demo-data.ts (frontend visualization) and agent/mock_siem.py + agent/mcp_server.py (agent investigation).
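As a rough sketch of how the three constraint tools could behave, assuming plain in-memory lookups: the function bodies, return shapes, and the `graph` dictionary layout below are assumptions, not the code in `agent/mcp_server.py`. The package hash is truncated in this write-up, so a placeholder digest is derived instead, and the legitimate-domain set lists only one illustrative entry rather than the five real ones.

```python
import hashlib

# Placeholder digest (assumption): stands in for the truncated dataset hash.
PLACEHOLDER_SHA256 = hashlib.sha256(b"hypothetical-sample").hexdigest()

MALICIOUS_HASHES = {PLACEHOLDER_SHA256: "Trojan.GenericKD.71498234"}
MALICIOUS_DOMAINS = {"data-exfil.darknet.io": "APT-41 C2"}
# Illustrative subset only; the real set holds 5 CDN/cloud providers.
LEGITIMATE_DOMAINS = frozenset({"cdn.legit-analytics.com"})


def hash_constraint_check(sha256: str) -> dict:
    """Return MALICIOUS with a classification, or CLEAN."""
    classification = MALICIOUS_HASHES.get(sha256.lower())
    if classification is not None:
        return {"verdict": "MALICIOUS", "classification": classification}
    return {"verdict": "CLEAN"}


def domain_reputation(domain: str) -> dict:
    """Return MALICIOUS, LEGITIMATE, or UNKNOWN for a domain."""
    name = domain.strip().lower()
    if name in MALICIOUS_DOMAINS:
        return {"verdict": "MALICIOUS", "classification": MALICIOUS_DOMAINS[name]}
    if name in LEGITIMATE_DOMAINS:
        return {"verdict": "LEGITIMATE"}
    return {"verdict": "UNKNOWN"}


def cancel_hypothesis(graph: dict, node_id: str, reason: str) -> dict:
    """Shatter a node: zero its confidence and drop its outgoing edges."""
    node = graph["nodes"][node_id]
    node.update(status="shattered", confidence=0, reason=reason)
    graph["edges"] = [e for e in graph["edges"] if e["source"] != node_id]
    return node


graph = {
    "nodes": {"node-3": {"label": "cdn.legit-analytics.com",
                         "status": "suspicious", "confidence": 55}},
    "edges": [{"source": "node-3", "target": "node-9"},
              {"source": "node-1", "target": "node-3"}],
}
if domain_reputation("cdn.legit-analytics.com")["verdict"] == "LEGITIMATE":
    cancel_hypothesis(graph, "node-3", "LEGITIMATE")
```

Note that the shattered node itself stays in the graph; only its outgoing edges are removed, which keeps the correction visible.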
Accuracy Report
Classification Results Across 4 Scenarios
| Scenario | Total Nodes | True Positives | True Negatives (Benign) | Self-Corrected (FP→Shattered) | Missed | Final Confidence |
|---|---|---|---|---|---|---|
| Supply-Chain Attack | 7 | 4 (malicious) | 1 (investigating) | 1 (cdn.legit-analytics.com) | 0 | 91% |
| Ransomware Outbreak | 8 | 8 (all malicious) | 0 | 0 | 0 | 94% |
| Credential Stuffing | 7 | 5 (malicious) | 1 (investigating) | 1 (api.stripe.com) | 0 | 89% |
| Insider Threat | 7 | 5 (malicious) | 1 (investigating) | 1 (vpn-gateway.corp.com) | 0 | 88% |
| Totals | 29 | 22 | 3 | 3 | 0 | 90.5% avg |
Key Metrics
- True Positive Rate: 100% — All genuinely malicious artifacts were correctly identified
- False Positive Rate: 0% at conclusion — All 3 false positives were self-corrected via constraint checks before final verdict
- Self-Correction Rate: 3/3 (100%) — Every false positive was detected and shattered automatically
- Average Confidence: 90.5% across all scenarios
- Zero False Negatives: No malicious artifacts were missed in any scenario
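The headline figures can be recomputed from the per-scenario rows of the table. The sketch below only redoes that arithmetic; the `SCENARIOS` structure and its field names are hypothetical, while the numbers are copied from the table.

```python
from statistics import mean

# Per-scenario values copied from the classification table.
SCENARIOS = {
    "Supply-Chain Attack": {"true_pos": 4, "self_corrected": 1, "missed": 0, "confidence": 91},
    "Ransomware Outbreak": {"true_pos": 8, "self_corrected": 0, "missed": 0, "confidence": 94},
    "Credential Stuffing": {"true_pos": 5, "self_corrected": 1, "missed": 0, "confidence": 89},
    "Insider Threat":      {"true_pos": 5, "self_corrected": 1, "missed": 0, "confidence": 88},
}

true_pos = sum(s["true_pos"] for s in SCENARIOS.values())
missed = sum(s["missed"] for s in SCENARIOS.values())
self_corrected = sum(s["self_corrected"] for s in SCENARIOS.values())

# True positive rate = TP / (TP + FN), with missed artifacts as false negatives.
true_positive_rate = true_pos / (true_pos + missed)
average_confidence = mean(s["confidence"] for s in SCENARIOS.values())

print(true_pos, self_corrected, true_positive_rate, average_confidence)
# 22 3 1.0 90.5
```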
Self-Correction Mechanism
The key accuracy innovation in SIFT.Glass is the constraint-based self-correction loop:
- Agent flags a domain/hash as suspicious (heuristic)
- Agent must call domain_reputation or hash_constraint_check before marking malicious
- If the constraint returns LEGITIMATE, the agent calls cancel_hypothesis
- The node is visually shattered on the dashboard, leaving a visible audit trail of the correction
- The agent resumes investigation with updated hypothesis
This ensures the final investigation output has zero uncorrected false positives, even though the AI's initial heuristic classification may flag legitimate infrastructure.
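The control flow of that loop can be sketched as a single resolution step applied to every flagged node. This is an assumption-laden illustration: `resolve_hypothesis`, `stub_domain_check`, and the node dictionary fields are hypothetical names, and the stub lookup stands in for the real constraint tools.

```python
from typing import Callable


def resolve_hypothesis(node: dict, check: Callable[[str], str]) -> dict:
    """Run the mandatory constraint check and update the node accordingly."""
    verdict = check(node["label"])
    node["constraint_verdict"] = verdict
    if verdict == "LEGITIMATE":
        # Constraint mismatch: the heuristic flag was wrong, so shatter it.
        node["status"] = "shattered"
        node["confidence"] = 0
    elif verdict == "MALICIOUS":
        node["status"] = "malicious"
    else:
        # No evidence either way: keep investigating.
        node["status"] = "investigating"
    return node


def stub_domain_check(domain: str) -> str:
    """Stand-in for the real reputation tool, using the golden-path domains."""
    table = {
        "cdn.legit-analytics.com": "LEGITIMATE",
        "data-exfil.darknet.io": "MALICIOUS",
    }
    return table.get(domain, "UNKNOWN")


flagged = [
    {"id": "node-3", "label": "cdn.legit-analytics.com", "status": "suspicious", "confidence": 55},
    {"id": "node-4", "label": "data-exfil.darknet.io", "status": "suspicious", "confidence": 60},
]
resolved = [resolve_hypothesis(node, stub_domain_check) for node in flagged]
print([(n["id"], n["status"]) for n in resolved])
# [('node-3', 'shattered'), ('node-4', 'malicious')]
```

Passing the check in as a callable keeps the loop independent of whether the verdict comes from a local list or an external service.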
Evidence Integrity Approach
SIFT.Glass enforces evidence integrity through architectural separation — the agent cannot modify original evidence data:
Architectural Guardrails (enforced by code):
- Read-only evidence source: The SIEM alert data (mock_siem.py) and constraint databases (MALICIOUS_HASHES, MALICIOUS_DOMAINS, LEGITIMATE_DOMAINS in mcp_server.py) are Python constants. The agent receives them as tool call responses but has no MCP tool to write back to them.
- Append-only investigation layer: The agent can only INSERT new nodes/edges and UPDATE their status via MCP tools (report_node, add_edge, update_node_status). It cannot DELETE or ALTER the original SIEM events.
- Supabase RLS: Row-Level Security policies restrict the agent's service role to INSERT/UPDATE on investigation_nodes, investigation_edges, agent_state, and terminal_lines. The source evidence tables are not exposed.
Prompt-based Guardrails (enforced by system prompt):
- The agent's system prompt instructs it to call constraint-check tools (hash_constraint_check, domain_reputation) before classifying any artifact, rather than marking nodes malicious on heuristics alone. This is an instruction, not a code-level restriction.
- The cancel_hypothesis tool enforces that self-corrections are recorded (node status → shattered, confidence → 0) rather than silently deleted, preserving a full audit trail.
What happens if the agent tries to bypass protections:
- The MCP server exposes exactly 9 tools. There is no delete_evidence, modify_siem, or update_constraint_db tool. Any attempt to call a non-existent tool returns an error and is logged in the terminal panel.
- If the agent skips a constraint check and directly marks a node as malicious, the dashboard still displays the node, but without a constraint-verification badge, making the skip visible to the analyst.
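The allowlist behavior described above can be sketched as a dispatcher that only knows the nine tool names appearing in this write-up. The `call_tool` function, the stub handler, and the error format are hypothetical and do not reflect the actual MCP server code.

```python
from typing import Any, Callable


def _stub_handler(**_kwargs: Any) -> dict:
    """Placeholder standing in for the real tool implementations."""
    return {"ok": True}


# The nine tool names that appear in this write-up.
TOOL_NAMES = (
    "set_session", "update_agent_state", "log_terminal",
    "report_node", "add_edge", "update_node_status",
    "hash_constraint_check", "domain_reputation", "cancel_hypothesis",
)
TOOLS: dict[str, Callable[..., dict]] = {name: _stub_handler for name in TOOL_NAMES}


def call_tool(name: str, args: dict, terminal_log: list[str]) -> dict:
    """Dispatch a tool call; unknown tools are rejected and logged."""
    handler = TOOLS.get(name)
    if handler is None:
        terminal_log.append(f"ERROR: unknown tool '{name}'")
        return {"ok": False, "error": f"unknown tool: {name}"}
    return handler(**args)


terminal_log: list[str] = []
rejected = call_tool("delete_evidence", {"target": "siem"}, terminal_log)
accepted = call_tool("report_node", {"id": "node-1", "label": "192.168.1.42"}, terminal_log)
print(rejected["ok"], accepted["ok"], len(terminal_log))  # False True 1
```

Because destructive operations have no entry in the registry, no call path reaches them, whatever the model requests.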
Limitations
- Evidence datasets are synthetic — real-world SIEM data would contain more noise and ambiguity
- Constraint databases are small (1 known hash, 1 known C2, 5 legitimate domains) — production would use VirusTotal/AbuseIPDB APIs
- The agent currently handles single-session investigations — concurrent multi-alert triage is not yet supported
Agent Execution Logs
Single-agent submission. Architecture pattern: Custom MCP Server — single OpenClaw agent with 9 MCP tools.
Model: Claude Sonnet 4.6 (claude-sonnet-4-6).
Full Tool Execution Sequence — Golden-Path Scenario (Supply-Chain Attack)
| # | Timestamp | Tool Called | Input (key args) | Result | Confidence |
|---|---|---|---|---|---|
| 1 | +0s | set_session | session_id: a3f7b291, objective: "Investigate anomalous outbound traffic from 192.168.1.42" | Session initialized | 10% |
| 2 | +2s | update_agent_state | phase: scanning, current_tool: NetFlow Analysis | State updated | 10% |
| 3 | +3s | log_terminal | "Starting investigation — target: 192.168.1.42" | Logged | — |
| 4 | +5s | report_node | id: node-1, label: "192.168.1.42", type: ip | Node created | 85% |
| 5 | +8s | report_node | id: node-7, label: "npm install", type: process | Node created | 80% |
| 6 | +10s | report_node | id: node-2, label: "evil-pkg@2.1.0", type: hash | Node created | 60% |
| 7 | +15s | hash_constraint_check | sha256: a1b2c3d4...f0a1b2 | MALICIOUS: Trojan.GenericKD.71498234 | → 95% |
| 8 | +16s | update_node_status | node-2 → malicious (95%) | Status updated | 95% |
| 9 | +17s | add_edge | node-1 → node-2 (installed) | Edge created | — |
| 10 | +18s | add_edge | node-7 → node-2 (loaded) | Edge created | — |
| 11 | +20s | update_agent_state | phase: investigating, current_tool: Domain Reputation | State updated | 50% |
| 12 | +22s | report_node | id: node-3, label: "cdn.legit-analytics.com", type: domain | Node created | 55% |
| 13 | +28s | domain_reputation | domain: cdn.legit-analytics.com | LEGITIMATE: Cloudflare CDN | — |
| 14 | +29s | log_terminal | "Constraint Mismatch... Self Correcting." | Logged | — |
| 15 | +30s | cancel_hypothesis | node-3, reason: "LEGITIMATE — Cloudflare CDN" | NODE SHATTERED ⚡ | — |
| 16 | +32s | report_node | id: node-4, label: "data-exfil.darknet.io", type: domain | Node created | 60% |
| 17 | +35s | domain_reputation | domain: data-exfil.darknet.io | MALICIOUS: APT-41 C2 | → 98% |
| 18 | +36s | update_node_status | node-4 → malicious (98%) | Status updated | 98% |
| 19 | +37s | add_edge | node-2 → node-4 (exfiltrated to) | Edge created | — |
| 20 | +38s | report_node | id: node-5, label: "/tmp/.hidden_shell", type: file | Node created | 80% |
| 21 | +39s | update_node_status | node-5 → malicious (92%) | Status updated | 92% |
| 22 | +40s | add_edge | node-2 → node-5 (dropped) | Edge created | — |
| 23 | +41s | add_edge | node-5 → node-4 (beaconing) | Edge created | — |
| 24 | +43s | report_node | id: node-6, label: "svc_deploy", type: user | Node created | 70% |
| 25 | +44s | add_edge | node-6 → node-7 (executed) | Edge created | — |
| 26 | +48s | update_agent_state | phase: concluded, confidence: 91% | Investigation complete | 91% |
Summary
- Total tool calls: 26
- Duration: ~48 seconds
- Nodes created: 7
- Edges created: 6 (add_edge calls in the sequence above)
- Constraint checks: 3 (1× hash, 2× domain)
- Self-corrections: 1 (node-3 shattered)
- Final confidence: 91%
- Architecture pattern: Custom MCP Server (single agent, 9 tools)
- Model: Claude Sonnet 4.6 (claude-sonnet-4-6)
- Token usage: ~4,200 input tokens, ~2,800 output tokens per investigation loop
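The per-tool counts in this summary can be cross-checked by tallying the tool names from the execution sequence. The sketch below is only a verification aid: `TOOL_SEQUENCE` is transcribed by hand from the "Tool Called" column, and the variable names are hypothetical.

```python
from collections import Counter

# Tool names in call order, transcribed from the execution sequence table.
TOOL_SEQUENCE = [
    "set_session", "update_agent_state", "log_terminal",
    "report_node", "report_node", "report_node",
    "hash_constraint_check", "update_node_status",
    "add_edge", "add_edge", "update_agent_state",
    "report_node", "domain_reputation", "log_terminal", "cancel_hypothesis",
    "report_node", "domain_reputation", "update_node_status", "add_edge",
    "report_node", "update_node_status", "add_edge", "add_edge",
    "report_node", "add_edge", "update_agent_state",
]

counts = Counter(TOOL_SEQUENCE)

total_calls = len(TOOL_SEQUENCE)
nodes_created = counts["report_node"]
edges_created = counts["add_edge"]
constraint_checks = counts["hash_constraint_check"] + counts["domain_reputation"]
self_corrections = counts["cancel_hypothesis"]

for tool, n in counts.most_common():
    print(f"{tool}: {n}")
```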
Built With
- anthropic-claude
- framer-motion
- jest
- lucide-react
- model-context-protocol
- next.js
- openclaw
- protocol-sift-(mcp)
- python
- react
- react-flow
- sift-workstation
- supabase
- tailwind-css
- typescript
- vercel