Inspiration
Cybersecurity tools have a dirty secret: they catch threats but can't explain them. Our team watched security analysts drown in black-box alerts — "87% malicious" with zero context on why. Worse, the rise of AI systems introduced an entirely new attack surface — prompt injection — that no existing tool addressed. We built SentinelIQ to fix both problems: real detection and real explainability.
What it does
SentinelIQ is a real-time AI security platform with four independent threat engines running simultaneously:
- Phishing Engine — Fine-tuned BERT (65%) blended with 9 weighted heuristic signal categories (35%). Detects urgency-threat combos, credential harvesting, executive impersonation, and more. Gemini 2.0 Flash acts as a tertiary override for ambiguous cases.
- URL Analysis Engine — Same BERT model blended with 12 structural URL signals. Raw IPs score 1.0, brand spoofing scores 0.92, suspicious TLDs score 0.80, with multi-signal boosting for stacked indicators.
- Prompt Injection Engine — Pattern matching across 7 attack categories: direct overrides, homoglyph Unicode, role hijacking, token smuggling, and data exfiltration attempts — with Gemini as the final judge.
- Anomaly Detection Engine — Custom Isolation Forest (300 trees, 3% contamination) trained on 11 behavioral features including login velocity, geographic distance, device fingerprint, and privilege escalation flags.
Every result ships with SHAP feature attribution — so instead of a score, analysts get the exact signals that drove it.
How we built it
We designed a 3-layer cascade pipeline to keep latency low and costs near zero:
- Layer 1 — Pattern Matching (~0ms): Catches ~70% of obvious attacks instantly with zero API calls
- Layer 2 — BERT Neural Model (100–300ms): Handles ambiguous content with a 65/35 BERT-heuristic blend
- Layer 3 — Gemini 2.0 Flash (500–2000ms): Only fires on the ~2% of edge cases where BERT confidence < 0.55
Backend: FastAPI · PyTorch 2.3 · HuggingFace Transformers · scikit-learn · SHAP · Google Gemini 2.0 Flash · PyMuPDF · BeautifulSoup4 · Firebase Firestore
Frontend: React 18 · TypeScript · TailwindCSS v3 · Framer Motion · Recharts · Firebase Auth · Firestore onSnapshot for real-time updates
Infrastructure: Vercel (frontend) · Render (backend) · Firebase · GitHub with auto CI/CD on push
Challenges we ran into
Blending neural + heuristic signals without losing either. Getting the 65/35 BERT-heuristic weight to actually improve over either approach alone required careful calibration — too much BERT and rare-but-obvious heuristic patterns get diluted; too much heuristic and the model loses generalization on novel phishing language.
The confidence floor problem. When a Tier-1 signal fires (e.g. "account will be locked + verify immediately"), the BERT model would sometimes still return moderate confidence on benign-seeming phrasing. We implemented a hard confidence floor of 0.80 for critical pattern matches — if the pattern fires, the score cannot be diluted below that threshold regardless of model output.
Absence scoring. Teaching the system to flag what's missing — no personal name, no brand signature, no unsubscribe link — rather than only what's present, required rethinking the feature engineering entirely.
Prompt injection is a new frontier. There's no established labeled dataset for prompt injection at scale. We built the detection engine from scratch using pattern categories derived from known jailbreak and adversarial research, with Gemini serving as a semantic safety net.
Accomplishments that we're proud of
- 4 engines, 1 unified composite score — Phishing (30%) + Prompt Injection (28%) + URL (25%) + Anomaly (17%), blended with Confidence (40%), Severity (30%), SHAP density (20%), and historical threat rate (10%)
- SHAP explainability on every alert — the first security platform in this class to ship per-signal attribution as a core feature, not an afterthought
- Gemini invoked on <2% of requests — the cascade architecture makes the expensive LLM call genuinely rare while keeping accuracy high
- Prompt injection detection built-in — covering direct overrides, Unicode homoglyphs, role hijacking, encoded attacks, and social engineering attempts
- Real-time frontend — Firestore
onSnapshotstreams threat updates live without any polling
What we learned
Ensemble design is harder than it looks. Making four independent models agree on a composite score — without letting one engine dominate or cancel out another — required us to think carefully about weighting philosophy, not just accuracy. We also learned that explainability isn't a feature you bolt on at the end; SHAP had to be part of the architecture from day one, which changed how we structured our signal pipeline significantly.
What's next for SentinelIQ
- Email gateway integration — SMTP-level scanning so threats are caught before they reach inboxes
- Browser extension — real-time URL and page content analysis as users browse
- Fine-tuning on organization-specific data — letting enterprises train the anomaly engine on their own baseline behavioral patterns
- XGBoost composite scoring — replacing the current weighted average with a learned ensemble model for the final risk score
- Threat intelligence feed — aggregating flagged indicators across deployments (with privacy preservation) to improve detection rates across the network
Log in or sign up for Devpost to join the conversation.