SentinelAI - Real-Time Chat Security Agent

Inspiration

Cyber threats no longer arrive as obvious malware or exploits.
They arrive as messages — phishing emails, scam texts, impersonation attempts, and even prompt injection attacks targeting AI systems.

While building and experimenting with AI agents, we noticed a critical gap:

- Rule-based filters (regex, keywords) are too rigid to catch social engineering.
- Raw LLMs are powerful, but too unpredictable to be trusted with security decisions on their own.

As AI systems increasingly communicate with users — and with other AI agents — who protects the conversation itself?
That question inspired SentinelAI.

What It Does

SentinelAI is a real-time chat security agent that analyzes messages and enforces clear safety decisions:

ALLOW – Safe content
FLAG – Suspicious content
BLOCK – Dangerous content

Instead of acting like a chatbot, SentinelAI functions as an intelligent firewall for conversations, protecting users and AI systems from phishing, scams, malware, impersonation, and prompt injection attacks.

How We Built It

SentinelAI is built using a hybrid architecture that combines deterministic security logic with AI reasoning:

Deterministic Preprocessing
- Extracts URLs and detects urgency or authority signals
- Provides fast, reliable signals for analysis
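
A minimal sketch of what this preprocessing step could look like. The URL pattern, the term lists, and the function name `preprocess` are illustrative assumptions, not SentinelAI's actual rules.

```python
import re

# Hypothetical pattern and term lists; the project's real rules are not given here.
URL_PATTERN = re.compile(r"https?://[^\s<>]+", re.IGNORECASE)
URGENCY_TERMS = ("urgent", "immediately", "act now", "within 24 hours", "final notice")
AUTHORITY_TERMS = ("bank", "it support", "security team", "ceo", "administrator")


def preprocess(message: str) -> dict:
    """Extract cheap, deterministic signals from a chat message."""
    lowered = message.lower()
    return {
        "urls": URL_PATTERN.findall(message),
        # Plain substring checks: fast, but deliberately naive.
        "urgency": [term for term in URGENCY_TERMS if term in lowered],
        "authority": [term for term in AUTHORITY_TERMS if term in lowered],
    }
```

Signals like these are cheap to compute and fully reproducible, so they can be passed along as context for the reasoning step instead of being treated as a verdict.
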
Gemini 3 Reasoning Engine
- Gemini 3 is used purely as a reasoning step that classifies each message, not to generate chat replies
- It evaluates intent, social engineering patterns, and adversarial behavior
- Returns structured JSON output (threat type, confidence, reasoning)
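
Because the model's reply feeds a security decision, it helps to validate the JSON before trusting it. The sketch below assumes the field names `threat_type`, `confidence`, and `reasoning`, a confidence range of 0.0 to 1.0, and a label set taken from the threats named in this writeup. It does not show the Gemini call itself.

```python
import json
from dataclasses import dataclass

# Assumed label set, based on the threats named in this writeup.
THREAT_TYPES = {"none", "phishing", "scam", "malware", "impersonation", "prompt_injection"}


@dataclass(frozen=True)
class Verdict:
    threat_type: str
    confidence: float  # assumed range: 0.0 to 1.0
    reasoning: str


def parse_verdict(raw: str) -> Verdict:
    """Validate the model's JSON reply; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model reply is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")

    threat_type = data.get("threat_type")
    confidence = data.get("confidence")
    reasoning = data.get("reasoning")

    if not isinstance(threat_type, str) or threat_type not in THREAT_TYPES:
        raise ValueError(f"unknown threat type: {threat_type!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if not isinstance(reasoning, str):
        raise ValueError("reasoning must be a string")
    return Verdict(threat_type, float(confidence), reasoning)
```

Rejecting malformed replies outright keeps the downstream scoring deterministic even when the model misbehaves.
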
Risk Scoring Engine
- Converts Gemini’s output into a numerical risk score (0–100)
Policy Engine
- Enforces deterministic actions (ALLOW / FLAG / BLOCK)
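
The scoring and policy steps above can be sketched as two small pure functions. The severity weights, the bonus for preprocessing signals, and both thresholds are illustrative assumptions. The writeup states only the 0-100 range and the three actions.

```python
# Hypothetical weights and thresholds; the project's real values are not stated.
SEVERITY = {
    "none": 0.0,
    "scam": 0.7,
    "impersonation": 0.8,
    "phishing": 0.9,
    "prompt_injection": 0.9,
    "malware": 1.0,
}
FLAG_THRESHOLD = 40
BLOCK_THRESHOLD = 70


def risk_score(threat_type: str, confidence: float, signal_count: int = 0) -> int:
    """Map a verdict (plus a count of deterministic signals) to a 0-100 score."""
    base = 100 * SEVERITY[threat_type] * confidence
    bonus = min(signal_count, 4) * 5  # at most 20 points from preprocessing signals
    return max(0, min(100, round(base + bonus)))


def decide(score: int) -> str:
    """Deterministic policy: the same score always yields the same action."""
    if score >= BLOCK_THRESHOLD:
        return "BLOCK"
    if score >= FLAG_THRESHOLD:
        return "FLAG"
    return "ALLOW"
```

Keeping the thresholds outside the model means the final action can be audited and tuned without changing any prompt.
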
Memory & Escalation
- Repeated suspicious behavior automatically escalates risk
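
One way escalation could work is a short per-sender history that raises the score of a suspicious message when earlier messages from the same sender were also suspicious. The class name, window size, threshold, and step below are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class EscalationMemory:
    """Tracks recent raw scores per sender and escalates repeat offenders."""

    def __init__(self, window: int = 5, suspicious_at: int = 40, step: int = 15):
        self.suspicious_at = suspicious_at
        self.step = step
        # Only the most recent `window` scores are kept for each sender.
        self._history = defaultdict(lambda: deque(maxlen=window))

    def escalate(self, sender_id: str, score: int) -> int:
        history = self._history[sender_id]
        prior = sum(1 for past in history if past >= self.suspicious_at)
        history.append(score)  # store the raw score, not the escalated one
        if score < self.suspicious_at:
            return score  # benign messages are never escalated
        return min(100, score + prior * self.step)
```

With these example values, three consecutive messages scoring 45 from one sender would come out as 45, 60, and 75, so the third one crosses a block threshold of 70.
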
Audit Logging
- Every decision is cryptographically hashed (SHA-256)
- Stored in append-only logs for transparency and trust
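
A hedged sketch of a tamper-evident audit log using SHA-256 from Python's standard `hashlib`. Chaining each entry to the previous entry's hash is an assumption added here for illustration. The writeup states only that decisions are hashed and stored append-only. The log is held in a list to keep the example self-contained.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(record: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the hash reproducible.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> dict:
    """Append a decision record, linking it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"record": record, "prev_hash": prev_hash, "hash": _digest(record, prev_hash)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

Under this scheme, changing any stored decision after the fact makes `verify` return `False`, which is what makes the log useful as evidence.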