GroundTruth is a research-oriented observability and evaluation platform designed for LLM researchers and retrieval engineers to analyze, diagnose, and optimize Retrieval-Augmented Generation (RAG) systems.
Built for the Arize @ Google Cloud Partnerships Hackathon, the project combines Arize Phoenix tracing, OpenInference instrumentation, retrieval diagnostics, context efficiency analysis, long-context evaluation, and adaptive optimization. Together, these help researchers investigate how retrieval configurations affect grounding quality, latency, hallucination risk, token efficiency, and overall retrieval performance in modern large language model pipelines.
The platform traces prompts, retrieved chunks, answers, latency, token usage estimates, grounding scores, and hallucination risk using Arize Phoenix and OpenInference. Through Phoenix MCP integration, Gemini CLI can directly introspect runtime traces and retrieval behavior.
GroundTruth also includes a retrieval failure intelligence system that detects noisy retrieval, redundant context, retrieval omission, weak grounding, and hallucinated synthesis, then generates optimization recommendations and adaptive next retrieval configurations.
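As a rough illustration of how the failure categories above could be derived from per-query signals, here is a minimal sketch. The `RetrievalSignals` fields, the `detect_failures` function, and every threshold are hypothetical choices made for this example, not GroundTruth's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class RetrievalSignals:
    """Hypothetical per-query signals; all names and scales are assumptions."""
    grounding_score: float       # 0..1, share of the answer supported by retrieved chunks
    relevant_fraction: float     # 0..1, share of retrieved chunks judged relevant
    redundancy_fraction: float   # 0..1, share of chunks that near-duplicate another chunk
    answer_found_in_context: bool


def detect_failures(s: RetrievalSignals) -> list[str]:
    """Map signals to failure labels using illustrative, untuned thresholds."""
    failures = []
    if s.relevant_fraction < 0.5:
        failures.append("noisy_retrieval")
    if s.redundancy_fraction > 0.3:
        failures.append("redundant_context")
    if not s.answer_found_in_context:
        failures.append("retrieval_omission")
    if s.grounding_score < 0.6:
        failures.append("weak_grounding")
    if s.grounding_score < 0.3:
        failures.append("hallucinated_synthesis")
    return failures


if __name__ == "__main__":
    print(detect_failures(RetrievalSignals(0.2, 0.4, 0.5, False)))
```

A rule-based mapping like this is easy to inspect and tune; a real system might instead score these categories with an LLM judge or a learned classifier.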
To support long-context and inference-efficiency research, the platform measures context efficiency ratios, redundant retrieval chunks, degradation risk, and context dilution across retrieval settings.
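One possible way to compute a context efficiency ratio is sketched below. It assumes the ratio is the share of context tokens that come from non-redundant chunks, and it flags redundancy using Jaccard overlap of whitespace tokens. Both the definition and the 0.8 threshold are assumptions for illustration; the platform's actual metric may differ.

```python
def tokens(text: str) -> list[str]:
    """Crude whitespace tokenizer, used here only as a stand-in for a real one."""
    return text.lower().split()


def jaccard(a: list[str], b: list[str]) -> float:
    """Jaccard similarity between two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def redundant_chunk_indices(chunks: list[str], threshold: float = 0.8) -> list[int]:
    """Return indices of chunks that near-duplicate an earlier kept chunk."""
    redundant, kept = [], []
    for i, chunk in enumerate(chunks):
        toks = tokens(chunk)
        if any(jaccard(toks, k) >= threshold for k in kept):
            redundant.append(i)
        else:
            kept.append(toks)
    return redundant


def context_efficiency_ratio(chunks: list[str], threshold: float = 0.8) -> float:
    """Share of context tokens contributed by non-redundant chunks (assumed definition)."""
    total = sum(len(tokens(c)) for c in chunks)
    if total == 0:
        return 0.0
    redundant = set(redundant_chunk_indices(chunks, threshold))
    useful = sum(len(tokens(c)) for i, c in enumerate(chunks) if i not in redundant)
    return useful / total


if __name__ == "__main__":
    demo = ["the cat sat on the mat", "the cat sat on the mat", "dogs bark loudly at night"]
    print(redundant_chunk_indices(demo), context_efficiency_ratio(demo))
```

Comparing this ratio across different top-k or chunk-size settings gives a simple view of how much of the prompt budget is spent on repeated material.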
The system includes a self-improvement optimization loop that analyzes evaluation and observability signals from previous experiments to iteratively recommend retrieval improvements such as adjusting chunk size, modifying top-k retrieval, reducing noisy context, and applying stricter grounded prompting.
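A single step of such a loop might look like the sketch below. The `RetrievalConfig` fields, the failure labels, and the adjustment rules are all hypothetical; they only illustrate turning diagnosed failures into the next configuration to try.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RetrievalConfig:
    """Hypothetical retrieval settings; field names and defaults are assumptions."""
    chunk_size: int = 512
    top_k: int = 5
    strict_grounding_prompt: bool = False


def recommend_next_config(config: RetrievalConfig, failures: list[str]) -> RetrievalConfig:
    """Propose the next configuration from observed failure labels (illustrative rules)."""
    nxt = config
    if "noisy_retrieval" in failures or "redundant_context" in failures:
        # Fewer chunks means less off-topic or repeated context.
        nxt = replace(nxt, top_k=max(1, nxt.top_k - 1))
    if "retrieval_omission" in failures:
        # Widen retrieval when the needed evidence was not retrieved.
        nxt = replace(nxt, top_k=nxt.top_k + 2)
    if "context_dilution" in failures:
        # Smaller chunks keep each retrieved passage more focused.
        nxt = replace(nxt, chunk_size=max(128, nxt.chunk_size // 2))
    if "weak_grounding" in failures or "hallucinated_synthesis" in failures:
        nxt = replace(nxt, strict_grounding_prompt=True)
    return nxt


if __name__ == "__main__":
    print(recommend_next_config(RetrievalConfig(), ["noisy_retrieval", "weak_grounding"]))
```

Running this after each experiment and feeding the result into the next run gives an iterative loop; stopping criteria and conflict handling between rules are left out of this sketch.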
An experiment dashboard tracks and compares results, letting researchers benchmark grounding, latency, retrieval efficiency, and adaptive optimization behavior across multiple retrieval configurations.
- Offline Research Mode (open-source testing)
Demo link: https://youtu.be/JSuyk-zalSE?si=Hy1pmXoPaW0CBxQf
- Live Gemini + Phoenix Mode for Google Cloud Rapid Agent Hackathon
