Inspiration
Splunk MCP Server, AI Assistant, and the new agent-friendly Splunk AI surfaces unlock a great new class of operational agents — incident investigators, security-triage bots, capacity-planning copilots. The thing that breaks them is the same thing that breaks every agent loop: when something goes wrong in production at 2 AM, you have no idea which call regressed, how much that retry cost, or why the agent kept retrying a Splunk search that was already returning empty.
GeminiLens is the missing observability layer for Splunk-driven agents.
What it does
GeminiLens wraps any Gemini or Splunk-AI-backed agent and, for every model + tool call, records:
- wire-level traces (full request/response, redacted)
- per-call USD cost (input/output/cache tokens × model price)
- retry count + retry reasons
- latency p50 / p95 / max per step
- JSONL audit log for compliance and post-mortems
A bundled Streamlit dashboard shows it all at a glance. Drill into any step to see exactly what the agent saw, what it said back, how long it took, and what it cost.
The Observability track of the Splunk Agentic Ops Hackathon asks for "tools that help organizations monitor systems smarter". GeminiLens answers that by turning an opaque Splunk MCP agent run into an inspectable one.
How we built it
GeminiLens core (PyPI: geminilens):
- An httpx transport plug-in that intercepts every Gemini API call without changing your agent code
- A per-call cost calculator that knows the current Gemini pricing matrix (including 2026 cache-discount pricing)
- A JSONL audit log writer with deterministic redaction
- A Streamlit dashboard that reads the JSONL and renders per-host, per-tool, per-model breakdowns
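The per-call cost calculation described above (input/output/cache tokens × model price) can be sketched as follows. This is a minimal illustration, not GeminiLens's actual code: the `Price` class, the `call_cost_usd` function, the `"example-model"` key, and every price in the table are hypothetical placeholders, and the sketch assumes cached tokens are counted as a subset of input tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per one million tokens (hypothetical structure)."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


# Placeholder numbers for illustration only; these are NOT real Gemini prices.
PRICES = {
    "example-model": Price(input_per_m=1.00, output_per_m=4.00, cached_input_per_m=0.25),
}


def call_cost_usd(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Cost of one call; cached_tokens is the part of input_tokens served from cache."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price.input_per_m
        + cached_tokens * price.cached_input_per_m
        + output_tokens * price.output_per_m
    )
    return total / 1_000_000
```

Keeping prices in one small table means a new discount tier only needs a new column and one extra term in the sum.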
Splunk integration shape:
    import geminilens
    import google.generativeai as genai
    from splunk_mcp import SplunkClient

    # Start recording; calls made after this point land in the JSONL audit log.
    geminilens.attach(audit_path="runs/splunk-ops.jsonl")

    splunk = SplunkClient(host="...", token="...")

    def search_splunk(spl: str) -> list[dict]:
        """Run an SPL query against Splunk and return the matching events."""
        return splunk.search(spl)

    # any Gemini call from now on flows through GeminiLens
    model = genai.GenerativeModel("gemini-2.5-flash")
    response = model.generate_content(
        [
            "You are an SRE. The user just reported a 500 spike on /checkout. "
            "Search Splunk for relevant errors and summarize.",
        ],
        tools=[search_splunk],
    )
Every Gemini call + Splunk tool call lands in the JSONL audit log with full cost + latency, viewable in the dashboard.
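This write-up does not show how a tool call gets recorded, so the following is only a rough sketch of what logging one could involve: a decorator that times the call and appends one JSON line per invocation. The `audited` name and the record fields (`kind`, `tool`, `latency_ms`, `error`) are assumptions for illustration, not GeminiLens's schema.

```python
import functools
import json
import time


def audited(audit_path: str):
    """Decorator factory: append one JSONL record for every call to the wrapped tool."""

    def decorator(fn):
        @functools.wraps(fn)  # keep the tool's name and docstring intact
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            error = None
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                error = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so failed calls are logged too.
                record = {
                    "kind": "tool_call",
                    "tool": fn.__name__,
                    "latency_ms": round((time.perf_counter() - start) * 1000, 3),
                    "error": error,
                }
                with open(audit_path, "a", encoding="utf-8") as fh:
                    fh.write(json.dumps(record) + "\n")

        return wrapper

    return decorator
```

Applied as `@audited("runs/splunk-ops.jsonl")` above a function like `search_splunk`, a wrapper of this kind leaves the function's behavior unchanged while adding one log line per call.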
Challenges we ran into
The hard part wasn't capturing telemetry — it was making it useful at 2 AM. Three things ate time:
- Redaction without losing debuggability. Splunk search queries can contain PII (username=alice@). Default redaction strips too much; no redaction is unsafe. Settled on a deterministic placeholder map: alice@acme.com → <email_001>, consistent across the whole run, so the trace stays followable.
- Cost calc with 2026 Gemini cache pricing. The cached-input discount tier landed mid-build. Added a small unit table + per-call lookup.
- Streamlit refresh without re-loading the whole JSONL. Switched to incremental tail-read so a long-running session stays responsive in the dashboard.
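Two of the fixes above are easy to show in miniature. The sketch below assumes emails are the only PII type and that the audit log is append-only; `Redactor`, `JsonlTail`, and the regex are illustrative names and choices, not the GeminiLens implementation.

```python
import json
import re

# Simple email pattern for illustration; real PII detection needs more rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Redactor:
    """Maps each distinct email to one stable placeholder for the whole run."""

    def __init__(self) -> None:
        self._placeholders: dict[str, str] = {}

    def redact(self, text: str) -> str:
        def replace(match: re.Match) -> str:
            value = match.group(0)
            if value not in self._placeholders:
                # Number placeholders in order of first appearance.
                self._placeholders[value] = f"<email_{len(self._placeholders) + 1:03d}>"
            return self._placeholders[value]

        return EMAIL_RE.sub(replace, text)


class JsonlTail:
    """Returns only the records appended since the previous read."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._offset = 0  # byte position just past the last complete line consumed

    def read_new(self) -> list[dict]:
        records = []
        with open(self.path, "rb") as fh:
            fh.seek(self._offset)
            for line in fh:
                if not line.endswith(b"\n"):
                    break  # partial line still being written; retry next time
                self._offset += len(line)
                if line.strip():
                    records.append(json.loads(line))
        return records
```

Because the same address always maps to the same placeholder, a reader can still follow one user across steps of a trace without seeing the address. In a Streamlit app, an object like `JsonlTail` would need to live somewhere that survives reruns (for example `st.session_state`) so the offset is not reset on every refresh.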
Accomplishments
- Working GeminiLens on PyPI (pip install geminilens), MIT licensed, in production use against my own agent runs
- Streamlit dashboard with per-host, per-tool, per-model rollups
- Clean Splunk MCP integration shape: drops in around any SplunkClient.search() call without touching the agent
- Cost calculator covers Gemini 2.5 Flash, 2.5 Pro, and the cache-pricing tier
What we learned
The cheapest 10x for Splunk agent reliability is boring: a flat JSONL of every call. Once you have that, regressions become a diff. Cost overruns become a sorted-by-USD list. Latency spikes become a histogram. The agent code doesn't have to change.
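To make that concrete, here is a small sketch of two such rollups over already-parsed JSONL records. The field names (`step`, `latency_ms`, `cost_usd`) are assumed for illustration and may not match the real log schema; percentiles use the nearest-rank method.

```python
import math
from collections import defaultdict


def percentile(sorted_values: list, pct: int):
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def top_costs(records: list[dict], n: int = 5) -> list[dict]:
    """The n most expensive calls, most expensive first."""
    return sorted(records, key=lambda r: r["cost_usd"], reverse=True)[:n]


def latency_by_step(records: list[dict]) -> dict[str, dict]:
    """p50 / p95 / max latency for each step name."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["step"]].append(record["latency_ms"])
    summary = {}
    for step, values in grouped.items():
        values.sort()
        summary[step] = {
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
            "max": values[-1],
        }
    return summary
```

Both functions are plain passes over a list of dicts, which is the point: once every call is a flat record, the analysis needs no special tooling.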
What's next for GeminiLens for Splunk Agentic Ops
- Dedicated Splunk MCP middleware (in progress; small wrapper around splunk_mcp.SplunkClient that emits structured tool-call events)
- A "send back into Splunk" exporter: write the GeminiLens JSONL as Splunk events so SREs can use Splunk to query their own agent runs
- A geminilens-snapshot mode that fails CI when the agent regresses on a fixed Splunk-search scenario
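One possible shape for the "send back into Splunk" exporter is to wrap each audit record in a Splunk HTTP Event Collector (HEC) envelope. The sketch below only builds the request body and sends nothing. The `to_hec_payload` name, the `geminilens:call` sourcetype, and the `ts` field (assumed to hold epoch seconds) are hypothetical; the exporter itself is not built yet, so this is a guess at an approach rather than a description of one.

```python
import json
from typing import Iterable


def to_hec_payload(jsonl_lines: Iterable[str], sourcetype: str = "geminilens:call") -> str:
    """Wrap each JSONL audit record in an HEC event envelope, one envelope per line."""
    envelopes = []
    for line in jsonl_lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        envelope = {"event": record, "sourcetype": sourcetype}
        if "ts" in record:
            # Assumed field: epoch-seconds timestamp of the call.
            envelope["time"] = record["ts"]
        envelopes.append(json.dumps(envelope))
    return "\n".join(envelopes)
```

The resulting string could then be POSTed to an HEC endpoint (typically `/services/collector/event`) with an `Authorization: Splunk <token>` header, after which the agent's own runs become searchable like any other events.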
