SentinelAI

Inspiration

AI agents are increasingly being deployed in production environments to automate complex, multi-step tasks. But there's a critical problem: agents fail silently. They loop endlessly, drift away from their original goals, produce incoherent tool calls, and express low confidence — all without any built-in mechanism to detect or recover from these failures. We watched agent after agent go off the rails during our own experiments, and realized there was no "safety net" for agentic systems. That's what inspired SentinelAI.

What it does

SentinelAI is a real-time monitoring and intervention layer for AI agents. It wraps around any ReAct-style agent loop and:

Scores every step for risk on a scale of 0.0–1.0 using a weighted multi-factor algorithm
Detects 4 failure types automatically: infinite loops, goal drift, low confidence, and tool incoherence
Intervenes adaptively using 4 strategies matched to failure severity: reprompt, rollback, goal decomposition, and halt
Communicates securely across devices and networks via the Tailbridge layer (powered by Tailscale)
Visualizes everything in a real-time React dashboard (Webguide)
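
As a rough illustration of how a risk score might be matched to one of the four strategies, here is a minimal sketch. The cut-off values, the NONE case, and the assumption that the strategies escalate in the order listed (reprompt, rollback, goal decomposition, halt) are hypothetical and not taken from the SentinelAI codebase.

```python
from enum import Enum


class Intervention(Enum):
    NONE = "none"
    REPROMPT = "reprompt"
    ROLLBACK = "rollback"
    DECOMPOSE = "goal decomposition"
    HALT = "halt"


# Hypothetical cut-offs, checked from most to least severe.
THRESHOLDS = [
    (0.90, Intervention.HALT),
    (0.75, Intervention.DECOMPOSE),
    (0.60, Intervention.ROLLBACK),
    (0.40, Intervention.REPROMPT),
]


def choose_intervention(risk: float) -> Intervention:
    """Return the most severe intervention whose cut-off the risk reaches."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be in [0.0, 1.0]")
    for cutoff, action in THRESHOLDS:
        if risk >= cutoff:
            return action
    return Intervention.NONE
```

In a monitoring loop, a function like this would be called once per agent step, with the returned strategy applied before the next step runs.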

How we built it

SentinelAI is built in three layers:

Core Risk Engine (Python): We implemented a custom weighted risk scoring algorithm that combines four factors: Loop Detection (35%), Goal Drift (30%), Confidence Scoring (20%), and Tool Coherence (15%). Semantic drift is measured as the word-overlap cosine similarity between the original goal and each agent step. Confidence is inferred through hedge-word detection (e.g., "maybe", "unsure", "possibly"). Each step is annotated with failure type tags and stored in an intervention history for full auditability.
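
The scoring described above could look roughly like the following sketch. Only the four weights and the three example hedge words come from the description. The tokenizer, the hedge-count scale (`saturation`), the tagging threshold, and all function names are assumptions made for illustration. Loop and tool-coherence risks are passed in as precomputed values because their computation is not described here.

```python
import math
import re
from collections import Counter

# Weights come from the description above; thresholds and helpers are assumptions.
WEIGHTS = {"loop": 0.35, "drift": 0.30, "confidence": 0.20, "tool": 0.15}
HEDGE_WORDS = {"maybe", "unsure", "possibly"}  # the examples named in the text
TAG_THRESHOLD = 0.5  # hypothetical cut-off for tagging a failure type


def tokens(text: str) -> list[str]:
    """Lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine_overlap(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of a and b."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def drift_risk(goal: str, step: str) -> float:
    """Higher when the step shares fewer words with the goal."""
    return min(1.0, max(0.0, 1.0 - cosine_overlap(goal, step)))


def confidence_risk(step: str, saturation: int = 3) -> float:
    """Count hedge words; `saturation` hedges map to risk 1.0 (assumed scale)."""
    hedges = sum(t in HEDGE_WORDS for t in tokens(step))
    return min(1.0, hedges / saturation)


def risk_score(factors: dict[str, float]) -> tuple[float, list[str]]:
    """Weighted sum of per-factor risks in [0, 1], plus failure-type tags."""
    score = sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)
    tags = [name for name in WEIGHTS if factors.get(name, 0.0) >= TAG_THRESHOLD]
    return min(1.0, score), tags
```

Because the weights sum to 1.0 and each factor is kept in [0, 1], the combined score stays on the 0.0–1.0 scale without further normalization.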

Tailbridge (Distributed Agent Communication): Built on top of Tailscale, Tailbridge enables secure agent-to-agent (A2A) communication via taila2a — a topic-based pub/sub system with phone-book agent discovery and buffer-triggered activation. TailFS adds chunked, resumable, end-to-end encrypted file transfer across the tailnet.
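
To make the pub/sub idea concrete, here is a small in-memory sketch. It does not use Tailscale or the real taila2a API. The class and method names are hypothetical. It also assumes one possible reading of the terms above: the "phone book" is a name-to-address registry, and "buffer-triggered activation" means subscribers are invoked only once a topic's buffer reaches a set size.

```python
from collections import defaultdict
from typing import Callable, Optional

Handler = Callable[[list[str]], None]


class TopicBus:
    """In-memory stand-in for a topic-based pub/sub layer (illustrative only)."""

    def __init__(self, buffer_size: int = 3) -> None:
        self.buffer_size = buffer_size
        self.phone_book: dict[str, str] = {}  # agent name -> address
        self.subscribers: dict[str, list[Handler]] = defaultdict(list)
        self.buffers: dict[str, list[str]] = defaultdict(list)

    def register(self, name: str, address: str) -> None:
        """Add an agent to the phone book so peers can discover it."""
        self.phone_book[name] = address

    def lookup(self, name: str) -> Optional[str]:
        """Return the agent's address, or None if it is not registered."""
        return self.phone_book.get(name)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self.subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        """Buffer the message; wake subscribers only when the buffer is full."""
        buf = self.buffers[topic]
        buf.append(message)
        if len(buf) >= self.buffer_size:
            batch = list(buf)
            buf.clear()
            for handler in self.subscribers[topic]:
                handler(batch)
```

Batching like this trades a little latency for fewer agent activations. A real deployment would carry the messages over the tailnet rather than in process memory.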

Webguide Dashboard (React + TypeScript): A Material-UI powered real-time dashboard for monitoring agent status, risk scores, intervention history, and file transfer progress.

The agent reasoning itself is powered by the Gemini API (Flash for fast tool selection, Pro for deep reasoning), giving SentinelAI natural language understanding for context-aware interventions.

Challenges we ran into

Semantic drift is hard to measure in real time. We iterated through several approaches before settling on word-overlap cosine similarity, which balances accuracy and speed for live agent monitoring.
Tailscale ACL configuration for restricting cross-agent communication required careful design to avoid locking out legitimate agent traffic.
Balancing intervention aggressiveness — triggering too early disrupts productive agents; triggering too late allows cascading failures. Tuning the weighted thresholds took many test runs.
State management across rollbacks — preserving the correct execution checkpoint while discarding corrupted steps required careful design of the stateful execution manager.
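
The rollback challenge can be sketched as a checkpoint stack: snapshot the state before risky steps, then restore a snapshot and drop every step recorded after it. The `ExecutionManager` below is a hypothetical illustration of that pattern, not the project's actual stateful execution manager.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExecutionManager:
    state: dict[str, Any] = field(default_factory=dict)
    steps: list[str] = field(default_factory=list)
    # Each checkpoint stores (number of steps so far, deep copy of state).
    _checkpoints: list[tuple[int, dict[str, Any]]] = field(default_factory=list)

    def checkpoint(self) -> int:
        """Snapshot the current state and return the checkpoint id."""
        self._checkpoints.append((len(self.steps), copy.deepcopy(self.state)))
        return len(self._checkpoints) - 1

    def record(self, step: str, **updates: Any) -> None:
        """Log a step and apply its state updates."""
        self.steps.append(step)
        self.state.update(updates)

    def rollback(self, checkpoint_id: int) -> list[str]:
        """Restore a checkpoint and return the steps that were discarded."""
        step_count, saved = self._checkpoints[checkpoint_id]
        discarded = self.steps[step_count:]
        self.steps = self.steps[:step_count]
        self.state = copy.deepcopy(saved)
        # Checkpoints taken after the restored one are no longer valid.
        del self._checkpoints[checkpoint_id + 1:]
        return discarded
```

Deep-copying on both save and restore keeps later mutations from leaking into a stored checkpoint, at the cost of extra memory for large states.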

Accomplishments that we're proud of

A fully working agentic loop with live risk scoring and 4 intervention strategies
A novel weighted multi-factor failure detection algorithm that works across different agent task types
Secure, production-ready multi-agent communication over Tailscale with zero-trust networking
A clean, real-time monitoring dashboard that makes agent internals visible and auditable

What we learned

Building SentinelAI deepened our understanding of where and why agents fail. We learned that most agent failures aren't random — they follow detectable patterns. We also gained hands-on experience with Tailscale's networking model, Gemini's multi-model API, and the challenges of building stateful systems that need to recover gracefully from partial failures.

What's next for SentinelAI

Backboard integration for persistent semantic memory across agent sessions, enabling SentinelAI to learn from past failures and pre-empt known failure patterns
Voice alerts via ElevenLabs — audio notifications when risk scores breach critical thresholds
Auth0 authentication for the Webguide dashboard to secure access to sensitive agent telemetry
Open-source SDK so any developer can wrap SentinelAI around their existing agent stack in minutes