-
SENTINEL-G live dashboard showing real-time LLM health signals (confidence, latency, diversity) and overall system status.
-
Deterministic Failure Lineage: Unlike black-box monitoring, SENTINEL-G traces a failure from its root cause (t-12m) to the incident (t+0).
-
LLM failure and recovery surfaced as operational events in Datadog, enabling incident-style response workflows.
-
SENTINEL-G Titan Architecture: Google Vertex AI (Gemini 1.5 Pro/Flash) and Datadog HQ.
🛑 The Blind Spot
LLM deployments fail silently. While most teams monitor infrastructure uptime, they ignore Cognitive Reliability. SENTINEL-G is a Financial Risk Engine that treats LLM hallucinations and latency not as UX bugs, but as auditable financial incidents. By converting technical telemetry into real-time dollar impact, we enable enterprises to manage AI with the same rigor as high-frequency trading systems.
The Reality: When an LLM hallucinates, it's not a "UX glitch"—it's a trust violation.
When latency spikes in a chat bot, it's not "lag"—it's lost conversions.
Most teams know that their AI failed, but they have no idea what it cost them or how to fix it automatically.
We stop treating AI failures like vague mysteries and start treating them like traceable, recoverable business liabilities.
🛡️ What It Does
SENTINEL-G is a Reliability Control Plane that sits between your users and your AI models. Instead of passively logging errors, it actively manages the cognitive health of your application:
🔍 Detects Cognitive Failures: Uses deterministic rules (not vague prompts) to catch Hallucinations, Latency Anomalies, Cost Explosions, and Prompt Injection attacks.
💰 Prices the Risk: Instantly calculates the Dollar Impact of every failure (e.g., "This hallucination risks $93k in projected revenue loss").
📈 Beyond Monitoring: Traditional observability tells you that something is wrong. SENTINEL-G tells you how much it's costing you and streams this to Datadog via high-fidelity custom metrics.
🤖 Auto-Recovers: Executes surgical fixes without human intervention—toggling Vertex AI Grounding, switching between Gemini Pro and Flash, or blocking attackers—and verifies recovery.
The Result: Ops teams stop staring at "Error Rates" and start managing Revenue Risk.
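To make "deterministic rules" concrete, here is a minimal sketch of a rule-based failure classifier. It is an illustration only, not SENTINEL-G's actual rule set: the `Interaction` fields, the threshold values, and the injection marker phrases are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interaction:
    """One LLM request/response pair (hypothetical shape)."""
    confidence: float   # confidence proxy in [0, 1]
    latency_ms: float   # end-to-end latency
    cost_usd: float     # token cost of the call
    prompt: str         # raw user prompt


# Hypothetical thresholds; real values would be tuned per application.
MIN_CONFIDENCE = 0.70
MAX_LATENCY_MS = 3000.0
MAX_COST_USD = 0.05
INJECTION_MARKERS = (
    "ignore previous instructions",
    "reveal your system prompt",
)


def classify(interaction: Interaction) -> List[str]:
    """Return the failure labels whose fixed rules fire for this interaction."""
    failures: List[str] = []
    if interaction.confidence < MIN_CONFIDENCE:
        failures.append("hallucination_risk")
    if interaction.latency_ms > MAX_LATENCY_MS:
        failures.append("latency_anomaly")
    if interaction.cost_usd > MAX_COST_USD:
        failures.append("cost_explosion")
    lowered = interaction.prompt.lower()
    if any(marker in lowered for marker in INJECTION_MARKERS):
        failures.append("prompt_injection")
    return failures
```

Because every rule is a fixed comparison, the same input always produces the same labels, which is what makes the resulting incidents auditable.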
⚙️ How We Built It (The "Titan" Architecture)
We engineered a system that combines the cognitive power of Google Cloud Vertex AI with the mission-critical observability of Datadog.
- The Titan Backend (FastAPI + Python)
🌍 Global Telemetry Mesh: Since standard Datadog agents are unavailable in serverless environments, we engineered a custom, agentless HTTP telemetry pipeline. This streams metrics simultaneously to US and EU Datadog clusters, ensuring 100% observability.
🧠 Deterministic Classifier: A rule-based engine that identifies failure modes without relying on "black box" AI guesses.
⚖️ The Golden Ratio Metric: A proprietary algorithm that calculates the "Health" of an AI interaction based on (Confidence * Speed) / Cost.
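The Golden Ratio formula and the agentless pipeline can be sketched together. This is a rough illustration under stated assumptions, not the project's code: "Speed" is taken to be `1 / latency`, the metric name `sentinel.golden_ratio` and the helper names are hypothetical, and the requests are built against Datadog's public v2 series endpoint for the US and EU sites with the `DD-API-KEY` header.

```python
import json
import time
import urllib.request
from typing import Dict, List, Optional

# Datadog v2 metrics intake for the US and EU sites.
DATADOG_SERIES_URLS = {
    "us": "https://api.datadoghq.com/api/v2/series",
    "eu": "https://api.datadoghq.eu/api/v2/series",
}
GAUGE = 3  # metric type enum used by the v2 series payload


def golden_ratio(confidence: float, latency_s: float, cost_usd: float) -> float:
    """(Confidence * Speed) / Cost, assuming Speed = 1 / latency."""
    if latency_s <= 0 or cost_usd <= 0:
        raise ValueError("latency_s and cost_usd must be positive")
    speed = 1.0 / latency_s
    return (confidence * speed) / cost_usd


def build_series_payload(metric: str, value: float, tags: List[str],
                         timestamp: Optional[int] = None) -> dict:
    """Build a single-point gauge payload in the v2 series shape."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "series": [{
            "metric": metric,
            "type": GAUGE,
            "points": [{"timestamp": ts, "value": value}],
            "tags": tags,
        }]
    }


def build_requests(payload: dict,
                   api_keys: Dict[str, str]) -> List[urllib.request.Request]:
    """Prepare one POST per region; nothing is sent until urlopen is called."""
    body = json.dumps(payload).encode("utf-8")
    return [
        urllib.request.Request(
            DATADOG_SERIES_URLS[region],
            data=body,
            headers={"Content-Type": "application/json", "DD-API-KEY": key},
            method="POST",
        )
        for region, key in api_keys.items()
    ]
```

Each prepared request could then be sent with `urllib.request.urlopen(req, timeout=5)`. Keeping payload construction separate from sending makes the pipeline testable without network access and lets a failed region be retried independently.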
- The Cognitive Layer (Google Vertex AI)
Gemini 1.5 Pro: Used for complex reasoning and primary inference.
Gemini 1.5 Flash: Used for high-speed recovery and cost-optimized fallback.
Vertex AI Grounding: Dynamically enabled when the Risk Engine detects a drop in faithfulness.
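The division of labor between Pro, Flash, and Grounding can be expressed as a small pure routing function. The sketch below is an assumption-laden illustration: the failure labels, the `RoutingState` type, and the model ID strings are hypothetical placeholders, and no Vertex AI SDK calls are made.

```python
from dataclasses import dataclass, replace
from typing import List

# Illustrative model identifiers; real deployments pin specific versions.
PRIMARY_MODEL = "gemini-1.5-pro"
FALLBACK_MODEL = "gemini-1.5-flash"


@dataclass(frozen=True)
class RoutingState:
    """Which model serves traffic and whether grounding is on (hypothetical)."""
    model: str = PRIMARY_MODEL
    grounding_enabled: bool = False


def plan_recovery(state: RoutingState, failures: List[str]) -> RoutingState:
    """Map detected failure labels to the next routing state."""
    next_state = state
    # Latency or cost problems: fall back to the faster, cheaper model.
    if "latency_anomaly" in failures or "cost_explosion" in failures:
        next_state = replace(next_state, model=FALLBACK_MODEL)
    # Faithfulness problems: turn grounding on for subsequent requests.
    if "faithfulness_drop" in failures:
        next_state = replace(next_state, grounding_enabled=True)
    return next_state
```

Returning a new state instead of mutating the old one leaves a before/after pair that can be logged with the incident and compared when verifying recovery.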
- The Liquid Dashboard (React + Vite)
A glassmorphism-styled command center that visualizes the Failure Lineage (T-12m timeline) and Financial Risk Breakdown in real-time.
🔥 The Counterintuitive Insight
During development, we discovered that latency spikes are not the most expensive failure. The real killer is Silent Confidence Drift. When a model's confidence slowly degrades from 90% to 70%, it doesn't trigger standard infrastructure alerts. Functionally, the model starts giving "mediocre" answers that kill conversion rates silently. SENTINEL-G is designed specifically to catch this "Silent Killer" before it impacts the bottom line.
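One simple way to catch this kind of slow degradation is to compare a rolling mean of recent confidence scores against a fixed baseline. The sketch below is only one possible approach, not necessarily the one SENTINEL-G uses; the class name, window size, and allowed drop are assumptions.

```python
from collections import deque
from typing import Deque


class ConfidenceDriftDetector:
    """Flags drift when the rolling mean falls too far below a baseline."""

    def __init__(self, baseline: float, window: int = 20,
                 max_drop: float = 0.10) -> None:
        self.baseline = baseline      # expected healthy confidence, e.g. 0.90
        self.max_drop = max_drop      # tolerated drop before alerting
        self.recent: Deque[float] = deque(maxlen=window)

    def observe(self, confidence: float) -> bool:
        """Record one score; return True once the rolling mean has drifted."""
        self.recent.append(confidence)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough samples for a stable mean yet
        rolling_mean = sum(self.recent) / len(self.recent)
        return (self.baseline - rolling_mean) > self.max_drop
```

With a baseline of 0.90 and `max_drop` of 0.10, no single response at 0.70 raises an alert, but a sustained run of them pulls the rolling mean below 0.80 and does. That is the gap threshold-per-request alerting tends to miss.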
🚧 Engineering Reality & Constraints
Bypassing Agent Limitations: We reverse-engineered the Datadog Series API to ensure Custom Metrics and Events flow reliably from Render’s serverless architecture without a local daemon.
Determining "Truth": We avoid "AI monitoring theater" by using Proxy Signals—measuring vector similarity drops (RAG Faithfulness) and token log-probability variance to create a deterministic score.
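The two proxy signals can be combined into a single score along these lines. This is a sketch under assumptions, not the project's scoring code: the equal 0.5 weights, the `variance_cap` normalization, and the function names are invented for illustration, and the embedding vectors and token log-probabilities are assumed to be supplied by the caller.

```python
import math
from statistics import pvariance
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def reliability_score(answer_vec: Sequence[float],
                      context_vec: Sequence[float],
                      token_logprobs: Sequence[float],
                      variance_cap: float = 4.0) -> float:
    """Blend RAG faithfulness and log-probability stability into [0, 1]."""
    # Faithfulness: how closely the answer embedding tracks the retrieved context.
    faithfulness = max(0.0, cosine_similarity(answer_vec, context_vec))
    # Stability: low variance in token log-probabilities scores close to 1.
    variance = pvariance(token_logprobs)
    stability = 1.0 - min(variance, variance_cap) / variance_cap
    # Hypothetical equal weighting of the two signals.
    return 0.5 * faithfulness + 0.5 * stability
```

The score is a pure function of its inputs, so identical telemetry always yields an identical value, with no second model call involved in the judgment.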
🏆 Accomplishments We're Proud Of
Fully Deployed & Production-Ready: This is not a concept or a mock. SENTINEL-G is a working reliability layer with a live, responsive frontend and a structured backend logging system.
Closed-Loop Remediation: We successfully implemented a system that not only detects a failure but executes a verifiable fix (e.g., model routing) in under 5 seconds.
Boardroom-Ready Observability: We bridged the gap between Engineering and the C-Suite by making AI reliability a financial metric.
🔮 What's Next
Predictive Model-Aware Scaling: Using Datadog anomaly detection to preemptively switch to Gemini Flash before traffic-induced latency spikes occur.
Compliance Guard: Real-time PII redaction and automated GDPR violation cost estimation for enterprise deployments.

