Inspiration

When production breaks, a developer needs two kinds of intelligence at once: operational context (what Splunk sees: errors, metrics, anomalies) and code context (which file, which method, what changed). Today those live in two tools, the Splunk dashboard and the IDE, and switching between them costs about 13 minutes per incident. Splunk knows the symptom ("47 errors at line 47, 8× baseline") but not which file owns it; the IDE knows the code but not which error is firing right now.

What it does

DualContext answers both with one question. It fans a natural-language query out — in parallel — to two MCP servers: the Splunk MCP Server (operational reality) and the SigMap MCP Server (code structure). A Splunk-hosted model (gpt-oss-120b) fuses both into a single grounded answer that cites a specific log entry AND a specific file+method, then scores its own groundedness (0–1, PASS/FAIL). On the demo incident it pinpoints JwtTokenProvider.validateToken() (line 47) and the matching Splunk error spike in ~1 second.

How we built it

  • Splunk MCP Server (Splunkbase App 7931): operational context via the real run_splunk_query tool over MCP/JSON-RPC. Errors, error-rate timechart, and alert history are all SPL.
  • SigMap MCP Server (v7.0.0): code context via query_context — ranked file/method signatures at ~99.6% token reduction, over stdio JSON-RPC.
  • Synthesis + judge: a Splunk-hosted model (gpt-oss-120b) fuses the two contexts and scores groundedness (LLM-as-judge).
  • Orchestration: a Python agent runs both MCP queries concurrently, so wall-clock latency is set by the slower server rather than the sum of the two.
  • Analytics: investigation telemetry flows to Splunk via HEC into a Dashboard Studio dashboard — verified live in Splunk Enterprise 10.4.
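
The concurrent fan-out described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual agent: query_splunk and query_sigmap are hypothetical stand-ins that simulate latency with a sleep, where a real agent would perform MCP JSON-RPC calls.

```python
import asyncio
import time


# Hypothetical stand-ins for the two MCP calls. A real agent would do
# JSON-RPC I/O here; the sleep only simulates network latency.
async def query_splunk(question: str) -> dict:
    await asyncio.sleep(0.2)
    return {"source": "splunk", "question": question}


async def query_sigmap(question: str) -> dict:
    await asyncio.sleep(0.2)
    return {"source": "sigmap", "question": question}


async def fan_out(question: str) -> dict:
    # gather() runs both coroutines concurrently and returns results
    # in the same order as its arguments.
    ops, code = await asyncio.gather(
        query_splunk(question),
        query_sigmap(question),
    )
    return {"operational": ops, "code": code}


def run(question: str) -> tuple[dict, float]:
    # Returns the combined contexts plus the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = asyncio.run(fan_out(question))
    return result, time.perf_counter() - start
```

With two simulated 0.2 s calls, run() should take about 0.2 s rather than 0.4 s, which is the wall-clock gain that concurrency buys.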

Challenges we ran into

  • The Splunk MCP Server exposes one query tool (run_splunk_query), not separate metrics/alerts tools — so all operational signals are expressed as SPL.
  • Discovering the real tool names: SigMap's tool is query_context, and it has no groundedness-judge tool — so the Splunk hosted model scores groundedness instead. We verified every tool name against the live servers rather than assuming.
  • Making it runnable for judges with zero setup: the whole pipeline runs offline in demo mode, and an e2e test exercises the live SigMap MCP server.
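
For illustration, the two tool calls named above can be expressed as MCP tools/call requests over JSON-RPC 2.0. The envelope follows the MCP convention; the argument key "query", the SPL string, and the index name are assumptions made for this sketch and should be checked against each server's published tool schema.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tools_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Assumption: both tools accept a "query" argument. The SPL and the
# index name "app" are illustrative only.
splunk_req = tools_call(
    "run_splunk_query",
    {"query": "search index=app log_level=ERROR | timechart span=1m count"},
)
sigmap_req = tools_call(
    "query_context",
    {"query": "JWT token validation failures"},
)
```

Because the Splunk server exposes a single query tool, errors, the error-rate timechart, and alert history would all differ only in the SPL string passed to the same request shape.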

Accomplishments that we're proud of

  • A genuinely composed dual-MCP architecture: two MCP servers behind one agent, with the Splunk MCP Server used as an interop layer.
  • Every answer carries a groundedness score, so trust is visible.
  • Verified end to end, including a live Splunk Enterprise 10.4 dashboard driven by real HEC events (no hardcoded numbers).
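
As a sketch of how investigation telemetry might reach the dashboard, the snippet below builds a request for Splunk's HTTP Event Collector event endpoint. The endpoint path and the "Splunk <token>" authorization scheme are standard HEC; the sourcetype and the event field names are hypothetical, and the request is only constructed here, not sent.

```python
import json
import time
import urllib.request


def build_hec_request(base_url: str, token: str, event: dict,
                      sourcetype: str = "dualcontext:investigation"):
    """Build (but do not send) a POST to the HEC event endpoint."""
    body = json.dumps({
        "time": time.time(),       # epoch seconds
        "sourcetype": sourcetype,  # hypothetical sourcetype name
        "event": event,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/services/collector/event",
        data=body,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical telemetry fields for one investigation.
req = build_hec_request(
    "https://localhost:8088",
    "YOUR-HEC-TOKEN",
    {"groundedness": 0.83, "verdict": "PASS", "latency_s": 1.0},
)
# To send: urllib.request.urlopen(req), given a reachable HEC endpoint.
```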

What we learned

Operational and code context are complementary, not interchangeable. MCP makes composing independent context sources clean. And grounding answers in retrieved code structure (vs general knowledge) takes groundedness from 0.15 to 0.83.
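
To make the PASS/FAIL reading of those scores concrete, here is a minimal sketch of turning a 0 to 1 groundedness score into a verdict. The 0.5 cutoff is an assumption chosen for illustration; the threshold the project actually uses is not stated here.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.5  # assumption: the project's real cutoff is not documented


@dataclass(frozen=True)
class Verdict:
    score: float
    label: str


def to_verdict(score: float, threshold: float = PASS_THRESHOLD) -> Verdict:
    # Reject scores outside the documented 0-1 range.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"groundedness score out of range: {score}")
    return Verdict(score, "PASS" if score >= threshold else "FAIL")
```

Under that assumed cutoff, a score of 0.15 maps to FAIL and 0.83 maps to PASS.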

What's next for DualContext

Auto-trigger on Splunk alerts (PagerDuty/Slack), more SigMap tools (get_impact for blast-radius), multi-repo support, and a one-click "open the fix in your IDE" action.

Built With

  • dashboard-studio
  • gpt-oss-120b
  • mcp
  • model-context-protocol
  • python
  • sigmap
  • spl
  • splunk
  • splunk-hec
  • splunk-mcp-server