GeminiLens

Inspiration

Every Gemini project I built last year ran into the same four problems once it left my laptop. I couldn't tell which prompt was burning my Vertex AI bill. I couldn't see when p95 latency started creeping up after a release. I couldn't tell when the model's output got longer or noisier. And I couldn't audit which external hosts my agent's tools had actually called. There are tools for each piece, but they either lock you into a hosted backend or ask you to install a giant APM agent. GeminiLens is the smallest thing that solves all four, locally, before you have to commit to a vendor.

What it does

GeminiLens wraps any Vertex AI Gemini client and produces one Trace record per call with prompt, response, token counts, latency, USD cost, and a list of tool invocations. It includes a rolling-vs-baseline drift report so you can see when latency, cost, or output length is shifting. An httpx-based egress allowlist ensures that agent tools can only reach approved hosts. A Streamlit dashboard renders the traces with live metrics, drift cards, and a timeline. For production, an optional Dynatrace exporter pushes every trace as a structured log event whose attributes follow the gen_ai.usage.* semantic conventions.

How I built it

google-genai for Vertex AI Gemini 2.5 calls
Streamlit + pandas for the dashboard
httpx custom transport for the egress allowlist
Pure stdlib for cost math and drift, so the math is reviewable
pytest covering cost, observer, guard, Azure adapter, and Dynatrace exporter

The Gemini cost table is hand-curated from Google's published pricing and is checked into the repo so reviewers can audit it without leaving GitHub. There's also an Azure OpenAI adapter that wraps the same Trace shape for projects that mix Gemini and Azure OpenAI.
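
A stdlib-only cost calculation over such a table might look like the sketch below. The table layout and the cost_usd function are hypothetical, and the per-million-token rates are assumed figures for illustration that should be checked against Google's current pricing page. With these assumed rates, a 40-in/6-out call comes to $0.000027.

```python
from decimal import Decimal

# USD per one million tokens. Assumed, illustrative rates only; verify them
# against Google's published pricing before relying on the result.
PRICES_PER_MILLION = {
    "gemini-2.5-flash": {"input": Decimal("0.30"), "output": Decimal("2.50")},
}
MILLION = Decimal(1_000_000)


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one call; raises KeyError for unknown models."""
    rates = PRICES_PER_MILLION[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / MILLION


print(cost_usd("gemini-2.5-flash", 40, 6))
```

Decimal is used instead of float so that sub-cent amounts are not distorted by binary rounding, which keeps the arithmetic easy to review by hand.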

Challenges I ran into

google-genai's usage_metadata shape varies between client versions and between Vertex AI and the public Gemini API. The observer handles both. Drift on a small trace history is noisy, so I expose the window sizes and sample counts explicitly in the report rather than hiding them. Streamlit's WebSocket-driven rendering meant headless Chrome screenshots needed virtual-time budgets to capture a populated dashboard.
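
The point about exposing window sizes and sample counts can be sketched roughly as follows. This is one simple way to compare a recent window against a baseline window using only the stdlib; the drift_report function, its defaults, and the report keys are hypothetical and not taken from the GeminiLens source.

```python
from statistics import mean


def drift_report(values, baseline_size=20, recent_size=5):
    """Compare the mean of the newest samples with the samples just before them."""
    if baseline_size <= 0 or recent_size <= 0:
        raise ValueError("window sizes must be positive")
    recent = values[-recent_size:]
    baseline = values[:-recent_size][-baseline_size:]
    report = {
        # Configured window sizes and the actual sample counts are both
        # reported, so a reader can judge how much to trust the number.
        "baseline_window": baseline_size,
        "recent_window": recent_size,
        "baseline_n": len(baseline),
        "recent_n": len(recent),
        "baseline_mean": None,
        "recent_mean": None,
        "pct_change": None,
    }
    if not baseline or not recent:
        return report  # not enough history to compare
    report["baseline_mean"] = mean(baseline)
    report["recent_mean"] = mean(recent)
    if report["baseline_mean"] != 0:
        delta = report["recent_mean"] - report["baseline_mean"]
        report["pct_change"] = delta / report["baseline_mean"] * 100
    return report


latencies_ms = [100.0] * 10 + [150.0] * 5
report = drift_report(latencies_ms, baseline_size=10, recent_size=5)
print(report)

short = drift_report([100.0, 120.0], baseline_size=10, recent_size=5)
print(short["baseline_n"], short["pct_change"])
```

When the history is shorter than the recent window, the baseline is empty and the report carries the counts with no percentage, rather than a misleading figure.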

Accomplishments that I'm proud of

19 passing tests covering the full public API
Cost calculator returned $0.000027 for a real 40-in/6-out gemini-2.5-flash call, matching the published price table to six decimal places
Auto-seed on cold dashboard load so a reviewer opening the URL fresh sees a populated UI with realistic drift cards
Self-observation: GeminiLens can wrap itself, recording the cost of its own demo runs

What I learned

The OpenInference and OpenTelemetry GenAI semantic conventions are still diverging. Picking conservative attribute names (gen_ai.usage.input_tokens) that work in both worlds keeps the Dynatrace exporter future-proof.
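
To show what conservative attribute naming could look like in practice, here is a small sketch that maps a trace onto a flat log event. The gen_ai.* keys follow the OpenTelemetry GenAI semantic conventions; the geminilens.* keys, the trace_to_log_event function, and the overall payload layout are hypothetical and are not meant to represent the Dynatrace ingest schema.

```python
import json


def trace_to_log_event(trace: dict) -> dict:
    """Flatten a trace dict into a structured log event (illustrative layout)."""
    return {
        "content": f"gen_ai call to {trace['model']}",
        # Attribute names from the OpenTelemetry GenAI semantic conventions.
        "gen_ai.request.model": trace["model"],
        "gen_ai.usage.input_tokens": trace["input_tokens"],
        "gen_ai.usage.output_tokens": trace["output_tokens"],
        # Hypothetical custom keys, namespaced to avoid clashing with gen_ai.*.
        "geminilens.latency_ms": trace["latency_ms"],
        "geminilens.cost_usd": trace["cost_usd"],
    }


event = trace_to_log_event({
    "model": "gemini-2.5-flash",
    "input_tokens": 40,
    "output_tokens": 6,
    "latency_ms": 412.0,
    "cost_usd": 0.000027,
})
print(json.dumps(event, indent=2))
```

Keeping tool-specific fields under their own prefix means a later rename in either convention only touches the gen_ai.* lines.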

What's next for GeminiLens

Multi-process trace stitching for distributed agents
Vector-store retrieval drift signal
One-click Arize Phoenix export
TrueFoundry exporter for the LLM observability sponsor track in this hackathon

Built With

azure-openai
dynatrace
gemini
gemini-2.5
google-genai
httpx
llm-observability
opentelemetry
pandas
python
streamlit
vertex-ai

Updates

Mukunda Katta started this project — May 18, 2026 02:38 AM EDT
