LLM Guardrail AI

Inspiration

As large language models move from experimentation into production systems, a major gap becomes clear: while we can monitor infrastructure and APIs, we lack reliable ways to observe, measure, and enforce trust in AI behavior. Issues like hallucinations, prompt injection attacks, and unexpected cost spikes often appear only after users are affected.

LLM Guardrail AI was inspired by the need to treat AI systems like first-class production services—with measurable health, clear failure signals, and actionable remediation—rather than black boxes.

What it does

LLM Guardrail AI is a production safety and observability layer for Gemini-powered applications. It continuously evaluates every LLM request, assigns a real-time LLM Trust Score, and streams structured telemetry to Datadog.

The system detects three high-impact risk categories:

Hallucination Risk – abnormal output entropy and response instability
Security Risk – prompt injection and jailbreak patterns
Cost Risk – sudden token usage anomalies
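As a rough illustration of how signals like these might be computed, here is a minimal Python sketch. Every name, pattern, and default below is an assumption made for the example and is not taken from the project's actual detectors. It covers only the entropy half of hallucination risk; response instability would need multiple samples per prompt.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev
from typing import Sequence

# Hypothetical patterns for illustration; a real deny-list would be far broader.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"reveal (your|the) system prompt", re.IGNORECASE),
    re.compile(r"\bjailbreak\b", re.IGNORECASE),
]


def token_entropy(text: str) -> float:
    """Shannon entropy (bits per token) over whitespace-separated tokens."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def hallucination_risk(output: str, baseline_bits: float, tolerance_bits: float = 2.0) -> float:
    """Distance of the output's entropy from an expected baseline, scaled to [0, 1]."""
    deviation = abs(token_entropy(output) - baseline_bits)
    return min(deviation / tolerance_bits, 1.0)


def security_risk(prompt: str) -> float:
    """1.0 if any injection pattern matches the prompt, otherwise 0.0."""
    return 1.0 if any(p.search(prompt) for p in INJECTION_PATTERNS) else 0.0


def cost_risk(history: Sequence[int], current: int, z_cap: float = 4.0) -> float:
    """Scale the z-score of current token usage against recent history to [0, 1]."""
    if len(history) < 2:
        return 0.0  # not enough history to judge an anomaly
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0 if current <= history[0] else 1.0
    z = (current - mean(history)) / sigma
    return min(max(z / z_cap, 0.0), 1.0)
```

Each function returns a value in [0, 1], which keeps the three signals comparable before they are combined.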

When risk thresholds are exceeded, Datadog automatically raises incidents, correlates signals, and surfaces actionable context so engineers can respond.
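To make the idea of a Trust Score and a risk threshold concrete, here is one possible formulation: a weighted sum of the three risks, inverted onto a 0 to 100 scale. The weights, the threshold of 70, and all names are hypothetical; the project's real formula is not shown in this write-up.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"hallucination": 0.4, "security": 0.4, "cost": 0.2}
ALERT_THRESHOLD = 70.0


@dataclass(frozen=True)
class RiskSignals:
    # Each risk is expected to be in [0, 1].
    hallucination: float
    security: float
    cost: float


def trust_score(signals: RiskSignals) -> float:
    """Return a 0-100 score: 100 means no detected risk, 0 means maximal risk."""
    weighted_risk = (
        WEIGHTS["hallucination"] * signals.hallucination
        + WEIGHTS["security"] * signals.security
        + WEIGHTS["cost"] * signals.cost
    )
    clamped = min(max(weighted_risk, 0.0), 1.0)
    return round(100.0 * (1.0 - clamped), 1)


def status(score: float) -> str:
    """Classify a score against the alert threshold."""
    return "violation" if score < ALERT_THRESHOLD else "healthy"
```

Because the score is a deterministic function of its inputs, the same three signals always produce the same result, which makes a violation easy to reproduce and explain.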

How we built it

The application is composed of a frontend control panel and a backend trust enforcement service.

Frontend:
A React-based dashboard that visualizes the live LLM Trust Score, active guardrail violations, and recovery status using clear color-coded states.
Backend:
A Gemini-powered service running on Google Cloud Vertex AI. Each request is evaluated by a trust and risk engine that computes hallucination, security, and cost signals before calculating the overall Trust Score.
Observability:
Instead of using a Datadog Agent, the backend streams logs, metrics, and events directly to Datadog via APIs. Custom monitors detect guardrail violations and trigger incidents with root-cause summaries and remediation steps.
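For readers curious what agentless telemetry can look like, the sketch below builds requests for Datadog's HTTP intake using only the Python standard library. The endpoint hosts assume the default US Datadog site, and the metric name, source, service, and tags are placeholders invented for this example. Check the payload shapes against the current Datadog API reference before relying on them.

```python
import json
import time
import urllib.request

# Assumed endpoints for the default US Datadog site; other sites use different hosts.
METRICS_URL = "https://api.datadoghq.com/api/v2/series"
LOGS_URL = "https://http-intake.logs.datadoghq.com/api/v2/logs"
GAUGE = 3  # type code for a gauge in the v2 series payload


def build_metric_payload(score, tags, now=None):
    """Wrap a single Trust Score reading as a gauge series."""
    timestamp = int(now if now is not None else time.time())
    return {
        "series": [
            {
                "metric": "llm.trust_score",  # hypothetical metric name
                "type": GAUGE,
                "points": [{"timestamp": timestamp, "value": score}],
                "tags": list(tags),
            }
        ]
    }


def build_log_payload(message, tags):
    """Wrap one structured log entry for the logs intake endpoint."""
    return [
        {
            "ddsource": "llm-guardrail",  # hypothetical source name
            "service": "trust-engine",    # hypothetical service name
            "ddtags": ",".join(tags),
            "message": message,
        }
    ]


def build_request(url, payload, api_key):
    """Create a POST request carrying the JSON payload and the API key header."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "DD-API-KEY": api_key},
        method="POST",
    )
```

Sending is then a single call such as `urllib.request.urlopen(request, timeout=5)`. The sketch stops short of that so it can run without network access or a real API key.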

Authentication to Vertex AI is handled using Google Cloud Application Default Credentials, keeping secrets out of the frontend and aligning with production best practices.

Challenges we ran into

Defining trust in a measurable way:
Translating probabilistic LLM behavior into a deterministic, explainable Trust Score required careful balancing of simplicity and usefulness.
Environment limitations:
Our runtime environment did not support the Datadog Agent, so we implemented direct API-based telemetry while preserving full observability.
Keeping the MVP focused:
We intentionally limited the system to three guardrails to avoid noise and ensure clear, reproducible signals.

Accomplishments that we're proud of

Designed a clear, mathematical LLM Trust Score that updates in real time
Built a fully agentless Datadog integration using logs, metrics, and incidents
Created reproducible guardrail violations that demonstrate real production risks
Delivered an end-to-end system that moves from detection to remediation

What we learned

Observability for AI systems requires new signals beyond latency and uptime
Correlating multiple weak signals is more effective than relying on a single metric
Actionable context is critical: alerts without guidance slow down response
Production-ready AI systems need trust to be continuously monitored, not assumed

What's next for LLM Guardrail AI

Next, we plan to expand guardrails to include model drift and response quality trends, integrate automated remediation workflows, and support multiple LLM providers beyond Gemini. Long-term, LLM Guardrail AI could evolve into a standardized trust layer for deploying AI safely at scale.

Built With

react
supabase
tailwind

Updates

Andrew Omwenga started this project — Dec 29, 2025 06:06 AM EST
