Sentinel - Self-Governing Fraud-Decision Agent

What it does

Sentinel is a guardian system that supervises a fraud-detection model. It watches the model's decisions in real time, detects when its behavior drifts, uses Gemini to diagnose the cause, recommends a fix, and applies it only after a human approves.

The loop closes in six steps:

1. A trained fraud model scores each transaction: approve, escalate, or decline.
2. Every decision streams to Arize for observability.
3. A monitor tracks the decline rate and trips when it leaves the normal band.
4. Gemini investigates whether the cause is a targeted attack or broad model drift, and explains its reasoning.
5. Gemini recommends a concrete threshold change with a predicted effect.
6. A human approves, the fix is applied, and the decline rate recovers.
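
The monitor in the third step can be sketched as a rolling window over recent decisions. This is a minimal illustration, not Sentinel's actual code: the class name, the window size, the band limits, and the minimum sample count are all assumed values chosen for the example.

```python
from collections import deque


class DeclineRateMonitor:
    """Rolling decline-rate monitor that trips outside a normal band."""

    def __init__(self, window=200, low=0.02, high=0.10, min_samples=50):
        # window, low, high and min_samples are illustrative, not Sentinel's values.
        self.decisions = deque(maxlen=window)
        self.low = low
        self.high = high
        self.min_samples = min_samples

    def observe(self, decision):
        """Record one decision: 'approve', 'escalate', or 'decline'."""
        self.decisions.append(decision == "decline")

    @property
    def decline_rate(self):
        if not self.decisions:
            return 0.0
        return sum(self.decisions) / len(self.decisions)

    def tripped(self):
        """True once enough samples exist and the rate has left the band."""
        if len(self.decisions) < self.min_samples:
            return False
        return not (self.low <= self.decline_rate <= self.high)


monitor = DeclineRateMonitor(window=100, low=0.02, high=0.10, min_samples=50)

# Normal traffic: 5 declines in 100 decisions stays inside the band.
for i in range(100):
    monitor.observe("decline" if i % 20 == 0 else "approve")
normal_tripped = monitor.tripped()

# Surge: 40 consecutive declines push the rolling rate above the band.
for _ in range(40):
    monitor.observe("decline")
surge_tripped = monitor.tripped()
```

The band has a lower limit as well as an upper one, so the same check would also catch a model that suddenly stops declining anything.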

Inspiration

Fraud models don't fail loudly — they rot quietly. A model that works today slowly degrades as fraud patterns shift, and the catch is a feedback delay: you don't learn an approval was fraud until a chargeback arrives weeks later. By then the damage is done, and the opposite failure — wrongly declining real customers — is completely invisible.

So we stopped trying to watch the outcome and started watching the model's own behavior. A sudden change in the decline rate is an early signal you can see in minutes, long before any chargeback exists. That single idea — supervise the AI's behavior, not its delayed ground truth — is the whole project. And it generalizes: the same guardian could watch any high-stakes decision model.

How we built it

Decision model — a scikit-learn classifier trained on a real public credit-card fraud dataset (~1.3M transactions), scoring on category, amount, and time of day.
Observability — every decision is logged to Arize AX as an OpenTelemetry trace via arize-otel.
Investigation — Gemini 3 Flash on Google Cloud (Vertex AI) reasons over the live evidence and produces a human-readable diagnosis.
Guardian loop — Python orchestration: monitor → detect → investigate → recommend → human approval → apply → recover.
Interface — a Flask server streams the live loop to a web UI (live chart, agent-reasoning feed, and a real Approve button) over Server-Sent Events.
Deployment — containerless deploy to Render; runs live end to end.
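
As a rough sketch of how the guardian loop and the streaming interface might fit together, the generator below walks the stages listed above and formats each one as a Server-Sent Events frame. The function names, the event fields, and the `approve` callable (standing in for the human's button click) are assumptions made for illustration; only the stage order comes from the description above.

```python
import json

STAGES = ["monitor", "detect", "investigate", "recommend",
          "human approval", "apply", "recover"]


def guardian_events(approve):
    """Walk the stages in order; `approve` is a callable standing in for the human."""
    for stage in STAGES:
        if stage == "human approval":
            if not approve():
                yield {"stage": stage, "status": "rejected"}
                return  # nothing is applied without approval
            yield {"stage": stage, "status": "approved"}
        else:
            yield {"stage": stage, "status": "done"}


def sse_frame(event):
    """Format an event as an SSE frame: a 'data:' line followed by a blank line."""
    return f"data: {json.dumps(event)}\n\n"


# Approved run: all seven stages are emitted as frames.
frames = [sse_frame(e) for e in guardian_events(approve=lambda: True)]

# Rejected run: the loop stops at the approval gate and never reaches "apply".
blocked = [e["stage"] for e in guardian_events(approve=lambda: False)]
```

In a Flask app, a generator of such frames can be returned from a route as `Response(generator, mimetype="text/event-stream")`, which is one common way to serve Server-Sent Events.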

The architecture has a clean seam between the loop and the integrations, so the whole system runs in a no-credential simulation mode and switches to live Arize + Gemini by flipping environment variables — nothing else changes.
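
One way such a seam could look is a small factory that picks a backend from the environment. This is a sketch under stated assumptions: the variable name `SENTINEL_MODE`, the class names, and the `diagnose` method are hypothetical, and the live branch is left unimplemented rather than guessing at SDK calls.

```python
import os


class SimulatedInvestigator:
    """No-credential stand-in that returns a canned diagnosis."""

    def diagnose(self, evidence):
        return f"[simulated] decline rate {evidence['decline_rate']:.0%} looks anomalous"


class LiveInvestigator:
    """Placeholder for the live backend."""

    def diagnose(self, evidence):
        # A real implementation would call Gemini on Vertex AI here.
        raise NotImplementedError("live mode needs credentials and an SDK call")


def make_investigator(env=os.environ):
    """Pick the backend from an environment variable (hypothetical name)."""
    mode = env.get("SENTINEL_MODE", "simulation")
    return LiveInvestigator() if mode == "live" else SimulatedInvestigator()


# The loop only ever sees the `diagnose` interface, whichever backend is chosen.
investigator = make_investigator({"SENTINEL_MODE": "simulation"})
diagnosis = investigator.diagnose({"decline_rate": 0.43})
```

Because the loop depends only on the shared interface, switching modes touches configuration and nothing in the orchestration code.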

Challenges we ran into

Making the drift real, not staged. Instead of a fake "drift = true" switch, we inject a surge of real fraud transactions concentrated in one category, so the model genuinely misbehaves and the decline rate climbs on its own.
Observability latency. Arize has an ingestion delay, so a record you just logged isn't immediately queryable. We therefore log every decision to Arize for the dashboard, and compute the fast loop signal in memory from the same decisions.
Calibrating a real model. A trained classifier behaves differently from a toy one, so detection and recovery thresholds had to be tunable rather than hard-coded.
SDK reality. Gemini 3 preview is served only from the global Vertex endpoint, and the current Arize SDK sends data as traces rather than through the older inference logger. Both required adapting to the live APIs.
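
The first and third challenges can be illustrated with a toy version of the drift injection. Everything here is assumed for the example: the scoring function is a stand-in for the trained classifier, and the category names, score ranges, surge size, and threshold are invented. The point is only that the decline rate rises as a consequence of the injected transactions, with no drift flag anywhere, and that the threshold is a parameter rather than a constant.

```python
import random
from dataclasses import dataclass


@dataclass
class Transaction:
    category: str
    is_fraud: bool


def toy_score(txn, rng):
    """Stand-in for the trained model: fraud scores high, legitimate scores low."""
    return rng.uniform(0.7, 1.0) if txn.is_fraud else rng.uniform(0.0, 0.6)


def decline_rate(txns, rng, decline_threshold=0.65):
    """Fraction of transactions whose score reaches the tunable threshold."""
    declined = sum(toy_score(t, rng) >= decline_threshold for t in txns)
    return declined / len(txns)


rng = random.Random(7)
categories = ["grocery", "travel", "electronics"]

# Baseline traffic: 2% fraud spread across categories.
baseline = [Transaction(rng.choice(categories), i % 50 == 0) for i in range(1000)]

# Injected surge: extra fraud concentrated in a single category.
surge = [Transaction("electronics", True) for _ in range(150)]

baseline_rate = decline_rate(baseline, rng)
surge_rate = decline_rate(baseline + surge, rng)
```

With a real classifier the scores overlap far more than in this toy, which is why the detection and recovery thresholds need tuning against observed behavior.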

What we learned

The hard part of trustworthy AI isn't the model — it's knowing, in real time, when to stop trusting it.
Catching a leading behavioral signal beats waiting for lagging ground truth.
Autonomy and human control aren't opposites: the agent does all the detection and diagnosis, and the human makes only the one consequential decision.

What's next

Move the alarm fully into Arize monitors.
Point the same guardian at other decision systems (credit, content moderation, LLM agents).
Support multiple simultaneous drift signals beyond the decline rate.

Built with

Python · scikit-learn · Gemini 3 (Vertex AI) · Arize AX · Flask · Render

Built With

arize
chart.js
css
flask
gemini
google-cloud
html
javascript
opentelemetry
pandas
python
scikit-learn
server-sent-events
vertex-ai

Updates

Harshitha Sivalingala started this project — Jun 11, 2026 04:53 PM EDT
