Inspiration

Most RAG systems break quietly. The retrieval gets a little worse, the model starts paraphrasing more, citations stop matching the source, and nobody notices until a user complains. We wanted an agent that watches the system itself and acts before that complaint lands.

What it does

ragvitals is a Vertex AI / Gemini-powered agent that monitors a production RAG pipeline across five drift dimensions:

  1. Data drift: distribution shift in the source corpus
  2. Embedding drift: shift in the vector space (MMD with RBF kernel, sliced Wasserstein)
  3. Response drift: shift in answer style and length
  4. Confidence drift: shift in the model's self-rated certainty
  5. Query drift: shift in the kinds of questions users are asking
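To make the embedding-drift dimension concrete, here is a minimal sketch of a squared-MMD estimate with an RBF kernel. It is plain Python written for readability and is not the ragdrift implementation. The function names and the default `gamma = 1 / dim` are assumptions chosen for illustration.

```python
import math

def rbf(u, v, gamma):
    # RBF kernel value exp(-gamma * ||u - v||^2) for two vectors.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def mean_kernel(xs, ys, gamma):
    # Average kernel value over every pair (x, y).
    total = sum(rbf(u, v, gamma) for u in xs for v in ys)
    return total / (len(xs) * len(ys))

def mmd2_rbf(baseline, current, gamma=None):
    """Biased (V-statistic) estimate of squared MMD between two batches."""
    if gamma is None:
        # Illustrative default bandwidth, not a tuned value.
        gamma = 1.0 / len(baseline[0])
    return (
        mean_kernel(baseline, baseline, gamma)
        + mean_kernel(current, current, gamma)
        - 2.0 * mean_kernel(baseline, current, gamma)
    )
```

A score near zero suggests the two batches come from similar distributions. Larger values suggest the embedding space has moved. The two batches do not need to be the same size.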

When any dimension crosses its threshold, the agent does three things: it explains which dimension drifted and why, recommends the next action (rebuild the index, swap the embedding model, or rotate prompts), and writes a short incident summary to a Cloud Storage bucket that the on-call engineer reads.
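The threshold step could look roughly like the sketch below. The dimension keys, the threshold values a caller passes in, and the JSON layout are all hypothetical. In the real agent the explanation and recommended action come from Gemini, so here they are simply passed in as strings.

```python
import json
from datetime import datetime, timezone

# Hypothetical keys for the five drift dimensions.
DIMENSIONS = ("data", "embedding", "response", "confidence", "query")

def find_breaches(scores, thresholds):
    # Return the dimensions whose score exceeds its threshold.
    return [d for d in DIMENSIONS if scores[d] > thresholds[d]]

def build_incident(scores, thresholds, explanation, action):
    """Return a JSON incident summary, or None if nothing drifted."""
    breaches = find_breaches(scores, thresholds)
    if not breaches:
        return None
    incident = {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "drifted": breaches,
        "scores": {d: scores[d] for d in breaches},
        "explanation": explanation,       # would be produced by the model
        "recommended_action": action,     # e.g. "rebuild index"
    }
    return json.dumps(incident, indent=2)
```

The returned string is what would be uploaded to the bucket. Returning `None` when nothing breaches keeps quiet periods from generating empty incidents.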

How we built it

The math lives in ragdrift, a Rust crate (also shipped as ragdrift-py on PyPI via PyO3 + maturin). All five drift dimensions live in one library, with the Wasserstein and MMD math compiled to native code.

The agent layer is a thin Gemini wrapper that calls ragdrift on a schedule, interprets the scores, and produces the natural-language summary. Gemini handles the explanation and routing decision; ragdrift supplies the actual numbers.

Storage adapters cover OpenSearch, pgvector, and Pinecone. Metric exporters cover CloudWatch, Prometheus, and Datadog. Drop-in for whatever your team already runs.

Challenges

The hard part was getting the math to be both fast and explainable. The Wasserstein-1 implementation needed to handle the corner cases that surface in real corpora: ragged batch sizes, near-empty distributions during cold start, and embedding-space rotation that looks like drift but is not. The Rust core handles these cases in ~3 ms per check.
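As an illustration of the ragged-batch case, here is a small pure-Python sketch of a sliced Wasserstein-1 distance that accepts batches of different sizes. It is not the Rust core and makes no claim about its speed. The function names, the projection count, and the choice to raise on an empty batch are assumptions, and the sketch does not attempt the rotation handling mentioned above.

```python
import math
import random

def wasserstein1_1d(a, b):
    """Exact W1 between two 1-D empirical distributions of any sizes."""
    a, b = sorted(a), sorted(b)
    points = sorted(a + b)
    i = j = 0
    total = 0.0
    # Integrate |CDF_a - CDF_b| over each gap between consecutive points.
    for left, right in zip(points, points[1:]):
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        total += abs(i / len(a) - j / len(b)) * (right - left)
    return total

def sliced_wasserstein(x, y, n_projections=32, seed=0):
    """Average 1-D W1 over random unit directions."""
    if not x or not y:
        # Assumed cold-start policy for this sketch: refuse empty batches.
        raise ValueError("both batches need at least one vector")
    rng = random.Random(seed)
    dim = len(x[0])
    total = 0.0
    for _ in range(n_projections):
        d = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]
        px = [sum(c * v for c, v in zip(d, row)) for row in x]
        py = [sum(c * v for c, v in zip(d, row)) for row in y]
        total += wasserstein1_1d(px, py)
    return total / n_projections
```

Working from the empirical CDFs rather than pairing sorted samples one-to-one is what lets the two batches differ in length.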

Accomplishments

  • Five drift dimensions in one library, not five separate tools.
  • Native code via PyO3, so Python users pip install ragdrift-py and never touch Rust.
  • Already on crates.io and PyPI at v0.1.3, MIT/Apache-2.0 licensed.
  • The Gemini agent layer is small enough to vendor into your own codebase if you want.

What we learned

Drift detection is the part of RAG ops that most teams skip until it bites them. Once you have it, you stop guessing whether a regression is real and start knowing.

What's next

Native Vertex AI Vector Search adapter (currently you can use the OpenSearch one). Vertex Model Monitoring integration so the drift scores show up in Vertex's own dashboards. A second agent that auto-applies the recommended fix in shadow mode and waits for human approval to promote.
