Guardian Ops

Inspiration

Women's safety is not a niche problem. 1 in 3 women globally experience physical or sexual violence in their lifetime. When danger strikes — a late-night walk, an abusive situation, a moment of crisis — the difference between safety and tragedy can be measured in seconds.

ShieldHer started as my answer to that: a full-featured women's safety PWA built for Mind the Product World Product Day 2026, with one-tap SOS, shake/voice triggers, journey mode, AI crisis companion, fake call with teleprompter, disguise calculator mode, encrypted incident journal, and offline emergency numbers for 195 countries.

But ShieldHer was missing something critical: an operations brain. When an SOS fires, who is watching? When the AI companion detects a crisis signal at 2am, what happens next? Who analyzes patterns across incidents to identify dangerous zones before the next incident?

Guardian Ops is that brain. It transforms ShieldHer from a user-facing app into an enterprise-grade AI-native safety operations system — powered by Splunk.

What It Does

Guardian Ops creates a complete real-time safety operations loop:

ShieldHer PWA (SOS / shake / voice / panic PIN / crisis AI)
        │
        ▼
Guardian Emitter (SHA-256 anonymisation, IndexedDB offline queue)
        │
        ▼
Supabase guardian_events table (RLS secured, service-role only)
        │
        ▼ (Supabase Database Webhook on INSERT)
HEC Forwarder Edge Function
        │
        ▼
Splunk HEC → index=guardian_ops
        │    sourcetype=shieldher:incident
        │    1,400+ real events indexed
        │
        ▼ (5 real-time alert rules fire on severity ≥ 6)
Guardian Ops Agent Loop Edge Function
        │
        ├── AGENT 1: TRIAGE
        │     └── Queries Splunk MCP Server (JSON-RPC 2.0)
        │           ├── splunk_run_query: incident history (7 days, 1.5 km radius)
        │           ├── splunk_run_query: zone statistics (30 days)
        │           ├── Groq llama-3.3-70b reasons over real Splunk context
        │           └── Outputs: risk_level, confidence, escalate, context_summary
        │
        ├── AGENT 2: RESPONSE (only if escalate=true)
        │     ├── alert_contacts → Twilio SMS/WhatsApp to trusted contacts
        │     ├── escalate_emergency → 24/7 operator notification
        │     ├── update_heatmap → anonymous community safety heatmap
        │     └── create_notable → Splunk SOC notable event
        │
        └── AGENT 3: AUDIT
              ├── Groq synthesizes 3-sentence compliance summary
              ├── Writes 25-field audit event back to Splunk HEC
              ├── sourcetype=shieldher:agent_decision
              └── Fields: splunk_mcp_used, triage_splunk_hits, full reasoning chain
        │
        ▼
Guardian Ops SOC Dashboard (live Splunk app)
        ├── Live incident pipeline feed
        ├── Agent audit trail with AI summaries
        ├── Splunk notable events SOC queue
        ├── Geographic zone risk table
        ├── AI crisis score distribution
        └── Pipeline timing chart (triage_ms, response_ms, total_ms)

In numbers:

Pipeline completes in under 20 seconds end-to-end
1,400+ real events indexed from actual PWA usage in Kota, India
3 agents, 5 Splunk alert rules, 6 dashboard panels
splunk_mcp_used: true confirmed in every audit event
triage_splunk_hits: 1+ — real Splunk data enriching AI decisions

Architecture Deep Dive

Layer 1 — Event Emission (ShieldHer PWA)

The Guardian Emitter is a drop-in TypeScript library that integrates with any Next.js app. It:

Hashes user IDs with SHA-256 + salt client-side before any network call, so no raw user identifier ever leaves the device
Maintains an IndexedDB offline queue — if the user triggers SOS with no connectivity, events are queued and forwarded automatically when connectivity restores
Fires immediately on button press — before ShieldHer's own API call, ensuring Splunk gets the event even if the backend fails
Emits 14 event types: sos_triggered, shake_sos, voice_sos, panic_pin_entered, crisis_detected, route_deviation, checkin_timeout, journey_started, journey_completed, journal_entry, community_report, and more
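
The queue-and-forward behaviour described above can be sketched roughly as follows. This is an illustration only, not the contents of guardian-emitter.ts: OfflineQueue, GuardianEvent, and the Sender callback are hypothetical names, and an in-memory array stands in for the IndexedDB store so the sketch runs outside a browser.

```typescript
// Hypothetical event shape; the real emitter's fields may differ.
interface GuardianEvent {
  type: string;       // e.g. "sos_triggered"
  userHash: string;   // hashed before it reaches the queue
  timestamp: number;  // epoch milliseconds
}

// Sends one event; resolves true on success, false on failure (e.g. offline).
type Sender = (event: GuardianEvent) => Promise<boolean>;

class OfflineQueue {
  // In-memory stand-in for the IndexedDB store (assumption of this sketch).
  private pending: GuardianEvent[] = [];

  get size(): number {
    return this.pending.length;
  }

  // Try to send immediately; keep the event if the send fails or rejects.
  async emit(event: GuardianEvent, send: Sender): Promise<void> {
    const ok = await send(event).catch(() => false);
    if (!ok) this.pending.push(event);
  }

  // Call when connectivity returns: forward queued events in order and
  // stop at the first failure so nothing is dropped or reordered.
  async flush(send: Sender): Promise<number> {
    let sent = 0;
    while (this.pending.length > 0) {
      const ok = await send(this.pending[0]).catch(() => false);
      if (!ok) break;
      this.pending.shift();
      sent++;
    }
    return sent;
  }
}
```

In a browser, flush would typically be wired to the window "online" event, with the array replaced by an IndexedDB object store so queued events survive a page reload.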

Layer 2 — Ingestion (Supabase → Splunk HEC)

The HEC Forwarder is a Supabase Edge Function triggered by a database webhook on every guardian_events INSERT. It:

Enriches events with severity scores (1-10 scale: SOS=10, panic_pin=9, route_deviation=7, checkin_timeout=6)
Computes severity_label (critical/high/medium/low)
Casts lat/lon/severity as explicit numbers (Splunk indexes JSON strings by default — tonumber() in SPL required for geo queries)
Forwards as structured JSON to Splunk HEC with correct sourcetype=shieldher:incident
Writes audit records to splunk_forward_audit table for retry tracking
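
A minimal sketch of the enrichment step, under stated assumptions. The four severity scores, the index, and the sourcetype come from this write-up; the label cut-offs, the default score for unlisted event types, and the row field names (event_type, created_at, longitude) are assumptions made for illustration. The outer object follows Splunk's HEC event envelope (time, index, sourcetype, event).

```typescript
// Scores stated in the write-up; unlisted event types use an assumed default.
const SEVERITY: Record<string, number | undefined> = {
  sos_triggered: 10,
  shake_sos: 10,
  voice_sos: 10,
  panic_pin_entered: 9,
  route_deviation: 7,
  checkin_timeout: 6,
};
const DEFAULT_SEVERITY = 3; // assumption

type SeverityLabel = "critical" | "high" | "medium" | "low";

// Cut-offs are assumed; the write-up only names the four labels.
function severityLabel(severity: number): SeverityLabel {
  if (severity >= 9) return "critical";
  if (severity >= 7) return "high";
  if (severity >= 4) return "medium";
  return "low";
}

// Hypothetical shape of a guardian_events row.
interface RawEvent {
  event_type: string;
  user_hash: string;
  latitude: number | string;
  longitude: number | string;
  created_at: string; // ISO 8601 timestamp
}

// Build the JSON body for Splunk's HEC event endpoint.
function toHecPayload(raw: RawEvent) {
  const severity = SEVERITY[raw.event_type] ?? DEFAULT_SEVERITY;
  return {
    time: Date.parse(raw.created_at) / 1000, // HEC expects epoch seconds
    index: "guardian_ops",
    sourcetype: "shieldher:incident",
    event: {
      event_type: raw.event_type,
      user_hash: raw.user_hash,
      latitude: Number(raw.latitude),   // emit JSON numbers, not strings
      longitude: Number(raw.longitude),
      severity,
      severity_label: severityLabel(severity),
    },
  };
}
```

The resulting object would be sent with JSON.stringify as the POST body to the HEC endpoint, using an "Authorization: Splunk <token>" header.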

Layer 3 — SIEM (Splunk Enterprise)

5 real-time alert rules fire within 60 seconds of events arriving:

Critical SOS Alert — any sos_triggered, shake_sos, voice_sos (severity=10)
AI Crisis Detected — crisis_detected with ai_crisis_score >= 0.7
Route Deviation in High-Risk Zone — route_deviation + geo_zone_risk=high + deviation_metres >= 200
Panic PIN Entered — panic_pin_entered (silent distress signal via disguise calculator)
Repeat Check-in Timeout — 2+ missed check-ins for same user in 30-minute window

Each alert fires a webhook to the Agent Loop Edge Function.
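
The actual rules live as SPL saved searches in savedsearches.conf. To make their conditions concrete, here is a TypeScript mirror of the five rules as listed above. It is purely illustrative: the Incident shape and the timestamp field are assumptions, while the thresholds (0.7, 200 metres, 2+ timeouts in 30 minutes) are taken from the list.

```typescript
// Hypothetical incident shape; field names follow the ones used in this write-up.
interface Incident {
  event_type: string;
  user_hash: string;
  timestamp: number;          // epoch milliseconds (assumed field)
  ai_crisis_score?: number;
  geo_zone_risk?: string;
  deviation_metres?: number;
}

const SOS_TYPES = ["sos_triggered", "shake_sos", "voice_sos"];

// Rules 1-4 each look at a single event. Returns the rule name or null.
function matchSingleEventRule(e: Incident): string | null {
  if (SOS_TYPES.indexOf(e.event_type) !== -1) return "Critical SOS Alert";
  if (e.event_type === "crisis_detected" && (e.ai_crisis_score ?? 0) >= 0.7) {
    return "AI Crisis Detected";
  }
  if (
    e.event_type === "route_deviation" &&
    e.geo_zone_risk === "high" &&
    (e.deviation_metres ?? 0) >= 200
  ) {
    return "Route Deviation in High-Risk Zone";
  }
  if (e.event_type === "panic_pin_entered") return "Panic PIN Entered";
  return null;
}

// Rule 5: users with 2+ checkin_timeout events in the 30 minutes ending at `now`.
function repeatCheckinTimeouts(events: Incident[], now: number): string[] {
  const windowStart = now - 30 * 60 * 1000;
  const counts: Record<string, number> = {};
  for (const e of events) {
    if (e.event_type !== "checkin_timeout") continue;
    if (e.timestamp < windowStart || e.timestamp > now) continue;
    counts[e.user_hash] = (counts[e.user_hash] ?? 0) + 1;
  }
  return Object.keys(counts).filter((user) => counts[user] >= 2);
}
```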

5 scheduled reports run automatically:

Hourly incident KPI rollup
Daily geographic risk hotspot analysis
Agent decision audit log (compliance)
MCP tool usage performance
Response time SLA report

Layer 4 — AI Agents (Splunk MCP + Groq)

Triage Agent — intelligence gathering specialist:

Connects to Splunk MCP Server v1.2.0 via official JSON-RPC 2.0 protocol
Calls initialize to establish session, then tools/call with tool name splunk_run_query
Queries incident history: index=guardian_ops sourcetype=shieldher:incident earliest=-168h | where tonumber(latitude)>=X AND ... | stats count by geo_zone_risk
Queries zone statistics: 30-day aggregate risk for the incident location
Groq llama-3.3-70b reasons over real Splunk data and outputs structured TriageDecision
Fields: risk_level, confidence, escalate, reasoning, context_summary, splunk_method
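
One way to turn "1.5 km radius" into the latitude/longitude bounds used in the where clause is a square bounding box around the incident. The sketch below is an assumption about how such bounds could be computed, not the code in the agent loop: the function names are hypothetical, the longitude field name is assumed to be longitude, and 111.32 km per degree of latitude is an approximation.

```typescript
const KM_PER_DEGREE_LAT = 111.32; // approximate length of one degree of latitude

interface BoundingBox {
  minLat: number;
  maxLat: number;
  minLon: number;
  maxLon: number;
}

// Approximate a radius around a point with a square bounding box.
// Not suitable near the poles, where cos(latitude) approaches zero.
function boundingBox(lat: number, lon: number, radiusKm: number): BoundingBox {
  const dLat = radiusKm / KM_PER_DEGREE_LAT;
  // A degree of longitude shrinks with the cosine of the latitude.
  const dLon = radiusKm / (KM_PER_DEGREE_LAT * Math.cos((lat * Math.PI) / 180));
  return {
    minLat: lat - dLat,
    maxLat: lat + dLat,
    minLon: lon - dLon,
    maxLon: lon + dLon,
  };
}

// Build the 7-day incident-history query with numeric comparisons via tonumber().
function incidentHistoryQuery(lat: number, lon: number, radiusKm: number = 1.5): string {
  const b = boundingBox(lat, lon, radiusKm);
  const f = (n: number) => n.toFixed(6);
  return [
    "search index=guardian_ops sourcetype=shieldher:incident earliest=-168h",
    `| where tonumber(latitude)>=${f(b.minLat)} AND tonumber(latitude)<=${f(b.maxLat)}` +
      ` AND tonumber(longitude)>=${f(b.minLon)} AND tonumber(longitude)<=${f(b.maxLon)}`,
    "| stats count by geo_zone_risk",
  ].join(" ");
}
```

A box is cheaper to express in SPL than a true circle and slightly over-selects at the corners, which is usually acceptable for gathering context.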

Response Agent — action execution specialist:

Only activates if Triage says escalate: true
Has 5 tools: alert_contacts, escalate_emergency, update_heatmap, create_notable, submit_response
Executes actions in parallel where possible
Creates Splunk notable events via HEC for human SOC analyst review
Updates anonymous community safety heatmap with every high-severity incident

Audit Agent — compliance and learning specialist:

Always runs regardless of triage/response outcome
Groq synthesizes a 3-sentence compliance-grade audit summary
Writes 25-field structured event to Splunk: all triage fields, all response fields, splunk_mcp_used, triage_splunk_hits, audit_summary, agent_model, twilio_mock, incident metadata
This creates a permanent, queryable audit trail in Splunk of every AI decision
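
The control flow across the three agents (response gated on escalate, audit always running, per-stage timings) can be sketched as below. The agent functions are injected so the sketch stays self-contained; runPipeline, Agents, and AuditRecord are hypothetical names, and the record is far smaller than the 25-field audit event described above.

```typescript
interface TriageDecision {
  risk_level: string;
  confidence: number;
  escalate: boolean;
}

// Reduced stand-in for the audit event (assumption of this sketch).
interface AuditRecord {
  triage: TriageDecision | null;
  actions: string[];
  error: string | null;
  triage_ms: number;
  response_ms: number;
  total_ms: number;
}

interface Agents<I> {
  triage: (incident: I) => Promise<TriageDecision>;
  respond: (incident: I, decision: TriageDecision) => Promise<string[]>;
  audit: (record: AuditRecord) => Promise<void>;
}

async function runPipeline<I>(incident: I, agents: Agents<I>): Promise<AuditRecord> {
  const started = Date.now();
  const record: AuditRecord = {
    triage: null, actions: [], error: null,
    triage_ms: 0, response_ms: 0, total_ms: 0,
  };
  try {
    const decision = await agents.triage(incident);
    record.triage = decision;
    record.triage_ms = Date.now() - started;
    // The response agent only runs when triage asks for escalation.
    if (decision.escalate) {
      const responseStarted = Date.now();
      record.actions = await agents.respond(incident, decision);
      record.response_ms = Date.now() - responseStarted;
    }
  } catch (e) {
    // A failed stage is recorded rather than rethrown, so the audit still runs.
    record.error = e instanceof Error ? e.message : String(e);
  }
  record.total_ms = Date.now() - started;
  await agents.audit(record); // always runs, whatever happened above
  return record;
}
```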

Layer 5 — SOC Dashboard (Guardian Ops Splunk App)

A fully packaged, installable Splunk app with 6 dashboard panels:

Live Incident Pipeline Feed — real-time table with risk labels, AI scores, context summaries
Risk Level Distribution — pie chart of critical/high/medium/low
Splunk Notable Events Queue — SOC analyst review queue with urgency levels
Agent Audit Trail — full decision log with AI summaries, confidence scores, pipeline times
Triage Reasoning Log — agent's full reasoning chain per incident
Pipeline Performance — stacked bar chart of triage_ms vs response_ms per event

Splunk MCP Server Integration

This is the technical centrepiece of Guardian Ops and the feature I'm most proud of.

The Splunk MCP Server (Splunkbase app 7931, v1.2.0) exposes Splunk's search capabilities via the Model Context Protocol — a standardised JSON-RPC 2.0 interface for AI agents.

My integration:

// 1. Initialize MCP session
const initRes = await fetch(`${SPLUNK_URL}/services/mcp`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${MCP_TOKEN}`,
    "Content-Type": "application/json", // the body is a JSON-RPC message
    "Mcp-Session-Id": sessionId
  },
  body: JSON.stringify({
    jsonrpc: "2.0", id: 1, method: "initialize",
    params: { protocolVersion: "2024-11-05", clientInfo: { name: "guardian-ops" } }
  })
});

// 2. Query Splunk with real SPL via MCP
const queryRes = await fetch(`${SPLUNK_URL}/services/mcp`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${MCP_TOKEN}`,
    "Content-Type": "application/json", // the body is a JSON-RPC message
    "Mcp-Session-Id": sessionId
  },
  body: JSON.stringify({
    jsonrpc: "2.0", id: 2, method: "tools/call",
    params: {
      name: "splunk_run_query",  // Correct tool name in MCP Server v1.2.0
      arguments: {
        query: "search index=guardian_ops sourcetype=shieldher:incident earliest=-168h | where tonumber(latitude)>=25.0 ...",
        earliest_time: "-7d",
        row_limit: 25
      }
    }
  })
});

Result: The Triage Agent receives real Splunk data — actual historical incidents near the user's location — and uses that context to make a more informed escalation decision. This is not a mock. This is production SIEM data enriching live AI decisions.

Confirmed in Splunk audit events:

splunk_integration: "mcp" ✅
splunk_mcp_used: true ✅
triage_splunk_hits: 1+ ✅
triage_context_summary: "sos_triggered incident in high-risk zone with high incident history..." ✅
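
Because the tool name matters, it can help to enumerate the server's tools with the MCP tools/list method rather than hard-coding a guess. The helpers below are a hedged sketch: rpc and toolNames are hypothetical names, and the response type only models the parts of a tools/list result used here (a tools array whose entries have a name).

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a JSON-RPC 2.0 request object; params is omitted when not supplied.
function rpc(id: number, method: string, params?: Record<string, unknown>): JsonRpcRequest {
  return params === undefined
    ? { jsonrpc: "2.0", id, method }
    : { jsonrpc: "2.0", id, method, params };
}

// Only the parts of a tools/list response that this sketch reads.
interface ToolsListResponse {
  result?: { tools?: { name: string }[] };
  error?: { code: number; message: string };
}

// Extract tool names, surfacing JSON-RPC errors instead of hiding them.
function toolNames(res: ToolsListResponse): string[] {
  if (res.error) {
    throw new Error(`MCP error ${res.error.code}: ${res.error.message}`);
  }
  return (res.result?.tools ?? []).map((t) => t.name);
}
```

For example, JSON.stringify(rpc(3, "tools/list")) could be sent as the body of the same POST used for tools/call, and the parsed reply passed to toolNames to check that splunk_run_query is present before querying.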

Privacy & Security Design

Women's safety apps handle sensitive data. I designed Guardian Ops with privacy-first architecture:

Zero PII in Splunk — user IDs are SHA-256 hashed with a per-deployment salt before any event is emitted. Splunk only ever sees user_hash: "a1b2c3d4..." — never names, emails, or phone numbers
Row-level security — guardian_events table has RLS enabled with service-role-only access. No user-facing reads
Encrypted journal — ShieldHer's incident journal is AES-256 encrypted in Supabase. Guardian Ops only emits a counter event to Splunk — never journal content
Anonymous heatmap — community safety reports contribute to aggregate geo risk data with no user linkage
Offline resilience — events queued in IndexedDB during connectivity loss, forwarded automatically on reconnect
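
As an illustration of the hashing step, here is a small sketch using Node's built-in crypto module so that it runs synchronously. The way the salt and user ID are combined, and the name hashUserId, are assumptions; in the browser the equivalent digest would come from crypto.subtle.digest("SHA-256", ...), which is asynchronous.

```typescript
import { createHash } from "node:crypto";

// Hash a user ID with a per-deployment salt.
// Returns 64 lowercase hex characters; the raw ID is never included in the output.
// Joining salt and ID with ":" is an assumption of this sketch.
function hashUserId(userId: string, salt: string): string {
  return createHash("sha256").update(`${salt}:${userId}`).digest("hex");
}

// Example: the only identifier attached to an emitted event is the hash.
function anonymise(userId: string, salt: string, eventType: string) {
  return { event_type: eventType, user_hash: hashUserId(userId, salt) };
}
```

The same ID and salt always produce the same hash, which is what lets Splunk correlate repeat events per user without ever seeing who the user is.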

Challenges

1. Splunk MCP tool name discovery

The Splunk MCP Server v1.2.0 uses splunk_run_query as the tool name, not run_splunk_query (which I initially assumed). Discovering this required implementing tools/list to enumerate available tools.

2. Float field comparisons in SPL

Latitude and longitude arrive in Splunk as JSON strings (e.g., "25.1312917695237"). Standard SPL comparisons (latitude>=25.0) failed because Splunk treats them as strings. Fix: | where tonumber(latitude)>=25.0. This only works in a where clause after the search command, so the queries had to be restructured.

3. Groq TPM rate limiting

The free Groq tier has a 12,000 token-per-minute limit. Running three agents back to back at roughly 3,000 tokens each triggered rate-limit errors by the third agent. Solution: a 4-second delay between agents, plus exponential backoff retry that uses the exact wait time parsed from Groq's error message.

4. Deno bundler TypeScript compatibility

Supabase Edge Functions use a Deno bundler that rejects certain TypeScript patterns (class method return type annotations, Promise<{...}> on standalone functions). Solution: convert all class methods to arrow function syntax (callTool = async (name, args) => {}), remove all explicit return type annotations.

5. Cloudflare tunnel stability

Splunk runs locally (Windows), requiring Cloudflare Tunnel to expose ports 8088 (HEC) and 8089 (REST/MCP). Free tunnels generate new URLs on every restart, requiring secrets to be updated. For production, named Cloudflare tunnels would solve this.
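
For the rate-limiting fix in challenge 3, a retry helper along these lines would do the job. It is a sketch under assumptions: the error text is assumed to contain a phrase like "try again in 7.66s" (the exact Groq wording may differ), and parseRetryAfterMs and withBackoff are hypothetical names. For brevity it retries on any error, where a real implementation would likely retry only on rate-limit responses.

```typescript
// Pull a wait time out of a message such as "... try again in 7.66s",
// "... try again in 1m26.4s" or "... try again in 450ms" (formats assumed).
function parseRetryAfterMs(message: string): number | null {
  const m = /try again in\s+(?:(\d+)m(?!s))?(\d+(?:\.\d+)?)(ms|s)/i.exec(message);
  if (!m) return null;
  const minutes = m[1] ? parseInt(m[1], 10) : 0;
  const value = parseFloat(m[2]);
  const ms = m[3].toLowerCase() === "ms" ? value : value * 1000;
  return Math.round(minutes * 60000 + ms);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry fn, waiting for the hinted time when the error provides one and
// falling back to exponential backoff (baseMs, 2x, 4x, ...) otherwise.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 4,
  baseMs: number = 1000,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      if (attempt + 1 >= maxAttempts) throw e; // out of attempts
      const message = e instanceof Error ? e.message : String(e);
      const wait = parseRetryAfterMs(message) ?? baseMs * Math.pow(2, attempt);
      await sleep(wait);
    }
  }
}
```

The sleep parameter is injectable so the helper can be exercised without real delays.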

What I Learned

The Splunk MCP Server is genuinely powerful for agentic AI — querying production SIEM data mid-reasoning-loop changes what's possible in security operations
Multi-agent architectures with distinct roles (gather/act/audit) are more reliable and debuggable than single-agent systems
Splunk's SPL is expressive enough to support complex geo-spatial queries once you understand field type handling
Privacy-first design isn't just ethical — it's technically cleaner. SHA-256 hashing client-side eliminates an entire category of data breach risk

What's Next

Real Twilio escalation — A2P 10DLC registration for production SMS delivery
Splunk AI Assistant — Natural language queries on the guardian_ops index ("Show me all high-risk zones this week")
Named Cloudflare Tunnel — Persistent URLs that survive restarts
ShieldHer + Guardian Ops merger — Single unified deployment
Multi-city heatmap — Aggregate anonymous risk data across cities to identify systemic patterns

Technical Stack

Layer	Technology
Frontend/PWA	Next.js 14, TypeScript, Zustand, Tailwind
Database	Supabase (PostgreSQL + RLS + Edge Functions)
SIEM	Splunk Enterprise 10.4
MCP Integration	Splunk MCP Server v1.2.0 (Splunkbase 7931)
AI Models	Groq llama-3.3-70b-versatile
Alerting	Twilio SMS + WhatsApp (mocked)
Maps	Mapbox GL JS
Tunneling	Cloudflare Tunnel
Auth	Supabase Auth
Deployment	Supabase Edge Functions (Deno)

Repository Structure

guardian-ops/
├── architecture/
│   └── guardian_ops_architecture.svg    # Full system architecture diagram
├── lib/
│   └── guardian-emitter.ts              # Drop-in PWA event emitter
├── splunk-app/
│   └── guardian_ops/                    # Installable Splunk app
│       └── default/
│           ├── savedsearches.conf       # 5 alerts + 5 scheduled reports
│           └── data/ui/views/
│               └── guardian_ops_soc.xml # 6-panel SOC dashboard
├── supabase/
│   ├── functions/
│   │   ├── splunk-hec-forwarder/        # Webhook → Splunk HEC bridge
│   │   └── splunk-agent-loop/           # 3-agent MCP pipeline
│   └── migrations/
│       └── 20240614_guardian_ops_splunk.sql
├── README.md
└── LICENSE

Built With

claude
groq
mapbox
mcp
next.js
supabase
twilio
typescript