Inspiration
Traditional SOAR systems run static playbooks: alert -> predefined steps -> action. They handle known patterns well but break on alert variants, multi-stage incidents that need entity context, and ambiguous alerts where "investigate vs suppress" is a judgment call. When the playbook can't decide, the analyst gets paged on every alert and queue fatigue sets in.
We built an IR triage agent that thinks like a tier-1 SOC analyst, autonomously pulls context from Splunk via MCP, and emits a structured triage card with explicit uncertainty flags. The differentiator is honesty: when data is sparse the agent says so on the card instead of fabricating findings.
What it does
On alert fire:
- Receives the alert payload (search name, SID, SPL, matching event fields, time, owner, app).
- Autonomously queries the Splunk MCP Server: surrounding events for any host/user/process named in the alert, historical firings of the same alert signature, knowledge-object lookups (saved searches, alerts).
- Emits a strict JSON triage card: classification, severity, entity_context, historical_pattern, recommended_action (escalate / contain / investigate / suppress), reasoning, confidence (0.0–1.0), and an explicit uncertainty_flags array.
The discrete recommended_action values mean downstream automation (PagerDuty, ServiceNow, ticketing) can wire the card in without a human pre-filter.
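A downstream consumer can enforce that contract before routing a card anywhere. The following is a minimal sketch, not the project's own code: the field names and the four action values come from the card description above, while the function name parse_triage_card and the specific checks are assumptions made for illustration.

```python
import json

# Action values and field names as listed in the card description above.
ALLOWED_ACTIONS = {"escalate", "contain", "investigate", "suppress"}
REQUIRED_FIELDS = (
    "classification", "severity", "entity_context", "historical_pattern",
    "recommended_action", "reasoning", "confidence", "uncertainty_flags",
)


def parse_triage_card(raw: str) -> dict:
    """Parse the agent's JSON output and reject cards that break the contract.

    Hypothetical helper: the checks below are one reasonable reading of
    "strict JSON triage card", not the project's actual validator.
    """
    card = json.loads(raw)
    if not isinstance(card, dict):
        raise ValueError("triage card must be a JSON object")

    missing = [field for field in REQUIRED_FIELDS if field not in card]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    if card["recommended_action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {card['recommended_action']!r}")

    confidence = card["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")

    if not isinstance(card["uncertainty_flags"], list):
        raise ValueError("uncertainty_flags must be an array")
    return card
```

Rejecting a malformed card with an exception, rather than patching it up, keeps a bad model response from silently reaching PagerDuty or a ticketing system.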
How we built it
- Splunk Enterprise 10.4.0 running locally (60-day trial).
- Splunk MCP Server v1.1.3 from Splunkbase (app #7931), exposing 10 tools at https://<host>:8089/services/mcp: splunk_run_query, splunk_get_indexes, splunk_get_index_info, splunk_get_metadata, splunk_get_knowledge_objects, splunk_run_saved_search, splunk_get_info, splunk_get_user_info, splunk_get_user_list, splunk_get_kv_store_collections.
- Gemini 2.5 Flash via Vertex AI as the agent brain, using function calling. The agent loop in agent.py dynamically translates MCP tool schemas to Gemini function declarations, so adding new MCP tools requires no code changes.
- Python 3.11+ for orchestration; direct streamable-HTTP to Splunk's /services/mcp endpoint (no mcp-remote proxy needed).
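The dynamic translation step can be sketched roughly as follows. This is not the code from agent.py: it assumes each tool arrives in the shape the MCP tools/list response uses (name, description, inputSchema) and emits plain dicts with the name, description, and parameters keys that a Gemini function declaration carries. The function name mcp_tools_to_declarations is hypothetical.

```python
def mcp_tools_to_declarations(mcp_tools: list[dict]) -> list[dict]:
    """Map MCP tool descriptors to function-declaration dicts.

    Hypothetical helper. Because it only reads the descriptor fields,
    a tool added on the MCP server shows up here without code changes.
    """
    declarations = []
    for tool in mcp_tools:
        declarations.append({
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Tools without arguments get an empty object schema.
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        })
    return declarations
```

In practice the parameters schema may still need cleaning before the model API accepts it, since the two sides do not support exactly the same JSON Schema keywords.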
The triage system prompt in triage.py defines the JSON output schema and budget guardrails: max ~6 tool calls, conclude with low confidence + uncertainty flags when data is sparse, never invent fields not seen in the data.
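The tool-call budget can also be enforced in the loop itself, not only in the prompt. The sketch below is an illustration under stated assumptions, not the project's loop: decide stands in for the model call and call_tool for the MCP request, both hypothetical callables, and the fallback card values are made up for the example.

```python
from typing import Any, Callable

MAX_TOOL_CALLS = 6  # the prompt's budget is "max ~6 tool calls"


def run_triage(
    alert: dict,
    decide: Callable[[list], dict],
    call_tool: Callable[[str, dict], Any],
    max_tool_calls: int = MAX_TOOL_CALLS,
) -> dict:
    """Run a budgeted agent loop and return a triage card.

    decide(history) is assumed to return either {"final": card} or
    {"tool": name, "args": {...}}; both shapes are hypothetical.
    """
    history = [{"role": "alert", "content": alert}]
    tool_calls = 0
    while True:
        step = decide(history)
        if "final" in step:
            return step["final"]
        if tool_calls >= max_tool_calls:
            # Budget spent and the model still wants data: conclude with
            # low confidence and say why, rather than looping further.
            return {
                "classification": "undetermined",
                "severity": "unknown",
                "entity_context": {},
                "historical_pattern": "not established",
                "recommended_action": "investigate",
                "reasoning": "Tool-call budget exhausted before a conclusion.",
                "confidence": 0.1,
                "uncertainty_flags": ["tool_call_budget_exhausted"],
            }
        result = call_tool(step["tool"], step.get("args", {}))
        tool_calls += 1
        history.append(
            {"role": "tool", "name": step["tool"], "content": result}
        )
```

Checking the budget after asking the model gives it one last chance to conclude on its own once the final tool result is in.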
Challenges we ran into
- Splunk MSI on non-ASCII Windows hostname: serverName validation rejects hostnames that are non-ASCII or contain dashes. Required renaming the host + clean reinstall.
- Gemini AI Studio free tier quota: gemini-2.5-flash hit 503 high-demand and gemini-2.0-flash hit 429 daily quota within a handful of dev runs. Switched to Vertex AI for stable capacity.
- JSON Schema dialect mismatch: Splunk MCP tool schemas include keys (pattern, examples) that Gemini's function declaration parser rejects. We strip unsupported keys in _clean_schema.
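The schema-cleaning step might look something like the sketch below. The project's _clean_schema is not shown here, so this is an assumed implementation: it recursively drops the two keys named above and takes care not to delete a tool parameter that merely happens to be called pattern or examples.

```python
# Keys named above as rejected by the function declaration parser.
UNSUPPORTED_KEYS = {"pattern", "examples"}


def clean_schema(schema):
    """Return a copy of a JSON Schema with unsupported keywords removed.

    Hypothetical reimplementation; the input is left unmodified.
    """
    if isinstance(schema, list):
        return [clean_schema(item) for item in schema]
    if not isinstance(schema, dict):
        return schema  # strings, numbers, booleans, None

    cleaned = {}
    for key, value in schema.items():
        if key in UNSUPPORTED_KEYS:
            continue
        if key == "properties" and isinstance(value, dict):
            # Keys under "properties" are parameter names, not schema
            # keywords, so keep every name and clean only its sub-schema.
            cleaned[key] = {
                name: clean_schema(sub) for name, sub in value.items()
            }
        else:
            cleaned[key] = clean_schema(value)
    return cleaned
```

Dropping pattern loosens validation on the model side, so any format constraint it expressed is then enforced only by the MCP server when the tool call arrives.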
Accomplishments that we're proud of
- End-to-end agent runs on two sample alerts, each finishing in under three turns with valid JSON output and sensible severity, action, confidence, and uncertainty flags.
- Zero hallucinated findings on sparse-data runs: when Splunk returns no rows, the triage card states that through explicit uncertainty flags instead of inventing entity history.
- A strict JSON output schema that downstream automation can consume without parsing free text.
What we learned
LLM-on-SOC demos usually impress by being bold ("Critical attack detected, isolate host!"). The harder, more useful pattern is calibrated confidence with explicit uncertainty surfacing. A SOC analyst can trust confidence=0.4 with three uncertainty flags more than confidence=0.95 with a confident lie.
What's next for Splunk IR Triage Agent
- Wire the agent into Splunk's alert action framework so it triggers automatically on saved-search alerts, attaching the triage card to the modmail-style conversation in Splunk Web.
- Add a Splunk app wrapper so the agent installs as a Splunkbase app rather than a separate Python process.
- Extend recommended_action to suggest specific SOAR playbook IDs for the contain path.
Built With
- context
- mcp
- model
- splunk
- vertex