What's next for Sentinel AI

🛡️ SentinelAI: Agentic Edge Surveillance with Gemini 3

💡 Inspiration

The inspiration for SentinelAI came from a fundamental gap in modern public safety systems: passive surveillance that only records, never prevents.

I witnessed this firsthand when a local convenience store was robbed. Despite multiple CCTV cameras, the incident was only discovered hours later during a routine review. The footage captured everything, but did nothing to prevent the robbery or alert anyone in real time. This got me thinking:

What if surveillance systems could think, reason, and learn like a security expert?

Traditional AI surveillance systems suffer from three critical flaws:

High false positive rates (60-80%) that lead to alert fatigue
No contextual understanding - they detect patterns but can't reason about them
Static learning - they never improve after deployment

When I learned about Gemini 3's multimodal reasoning and agent orchestration capabilities, I realized this was the missing piece. Not just another object detection model, but a thinking system that could:

Understand why something is suspicious
Explain its reasoning in natural language
Make intelligent decisions about escalation
Learn and improve from every interaction

SentinelAI represents my vision for intelligent, privacy-first surveillance that respects human dignity while enhancing public safety.

🎓 What I Learned

This project was an incredible learning journey across multiple domains:

1. Agentic AI Architecture

I learned to design true multi-agent systems where each agent has a specialized role:

Moving beyond simple prompt engineering to orchestrated reasoning chains
Understanding how to break complex decisions into agent-specific tasks
Implementing feedback loops where agents learn from each other

The mathematical insight: an agent system's collective intelligence $ I_{\text{collective}} $ can exceed the weighted sum of its agents' individual capabilities:

$$ I_{\text{collective}} = \sum_{i=1}^{n} w_i \cdot I_i + \alpha \cdot \prod_{i,j} \text{synergy}(A_i, A_j) $$

where $ w_i $ are agent weights, $ I_i $ is the individual intelligence of agent $ A_i $, and $ \alpha $ scales the contribution of synergistic interactions between agents.
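As a rough illustration of how the formula could be evaluated, here is a minimal sketch. It reads the product as running over unordered agent pairs, which is an assumption, and the function name, the `synergy` callback, and all numeric values are hypothetical rather than taken from the project.

```python
from itertools import combinations
from math import prod


def collective_intelligence(weights, scores, synergy, alpha):
    """Weighted sum of individual scores plus alpha times the pairwise synergy product.

    Assumption: the product runs over unordered pairs (i < j).
    """
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    weighted_sum = sum(w * s for w, s in zip(weights, scores))
    pairs = combinations(range(len(scores)), 2)
    synergy_term = prod(synergy(i, j) for i, j in pairs)
    return weighted_sum + alpha * synergy_term


# Hypothetical two-agent example with a constant synergy of 1.2.
value = collective_intelligence(
    weights=[0.5, 0.5],
    scores=[0.6, 0.8],
    synergy=lambda i, j: 1.2,
    alpha=0.1,
)
```

With two agents there is a single pair, so the synergy term reduces to `alpha * 1.2` added on top of the weighted sum.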

2. Edge-Cloud Hybrid Architecture

Balancing privacy and intelligence required careful design:

Edge processing for real-time detection (30 FPS on CPU)
Selective cloud reasoning - only key frames sent to Gemini
Privacy preservation: $ \text{data\_transfer} = O(\log n) $ instead of $ O(n) $ for full video
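The selective upload idea can be sketched roughly as follows. This is an illustrative gate, not the project's actual code: the `Frame` type, the `threshold` of 0.7, and the `min_gap` spacing rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int
    suspicion: float  # edge-computed score in [0, 1] (hypothetical scale)


def select_key_frames(frames, threshold=0.7, min_gap=15):
    """Keep frames scoring above threshold, spaced at least min_gap frames apart."""
    selected = []
    last_index = None
    for frame in frames:
        if frame.suspicion <= threshold:
            continue  # low-suspicion frames never leave the edge device
        if last_index is not None and frame.index - last_index < min_gap:
            continue  # too close to the previous key frame
        selected.append(frame)
        last_index = frame.index
    return selected


# Hypothetical clip: frames 10 to 40 look suspicious, the rest do not.
clip = [Frame(i, 0.9 if 10 <= i <= 40 else 0.1) for i in range(100)]
key_frames = select_key_frames(clip)
```

Only the selected frames would be forwarded for cloud reasoning; everything else is processed and discarded locally.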

3. Self-Improving Systems

Implementing the learning loop taught me about:

Online learning from user feedback
Threshold optimization using gradient-free methods
Accuracy improvement curves: After $ k $ feedback iterations, accuracy improves as:

$$ \text{Accuracy}(k) = \text{Baseline} + \beta \cdot (1 - e^{-\lambda k}) $$

where $ \beta $ is the maximum achievable improvement and $ \lambda $ is the learning rate, which controls how quickly accuracy approaches that ceiling.
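A minimal sketch of both ideas, under stated assumptions: `expected_accuracy` evaluates the curve above, and `best_threshold` uses a plain grid search as one example of a gradient-free method. The source does not say which method was used, and the function names, feedback format, and numbers here are hypothetical.

```python
import math


def expected_accuracy(k, baseline, beta, lam):
    """Accuracy after k feedback iterations, following the saturating curve above."""
    return baseline + beta * (1.0 - math.exp(-lam * k))


def best_threshold(feedback, candidates):
    """Grid search: pick the candidate threshold that best matches user labels.

    feedback is a list of (suspicion_score, user_confirmed_threat) pairs.
    """
    def accuracy(threshold):
        correct = sum((score > threshold) == is_threat for score, is_threat in feedback)
        return correct / len(feedback)

    return max(candidates, key=accuracy)


# Hypothetical feedback collected from dashboard users.
feedback = [(0.9, True), (0.8, True), (0.6, False), (0.4, False)]
threshold = best_threshold(feedback, candidates=[0.3, 0.5, 0.7])
```

At `k = 0` the curve returns the baseline, and for large `k` it approaches `baseline + beta`.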

4. Production-Grade Frontend Development

Creating a truly distinctive UI meant:

Moving beyond template aesthetics to intentional design choices
Implementing smooth animations with pure CSS
Building responsive, accessible interfaces
Understanding the psychology of security dashboards (why red/blue/green matter)

5. Computer Vision Pipeline Optimization

I learned to optimize the detection pipeline:

YOLOv8n for lightweight object detection (6 MB model, 30 FPS)
MediaPipe Pose for behavior analysis without sending video to cloud
Temporal analysis for motion pattern recognition
Achieving real-time performance: $ \text{latency} < 33\text{ms} $ per frame
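The 33 ms figure follows from the 30 FPS target: 1000 ms / 30 is roughly 33.3 ms per frame. A small timing harness like the one below could be used to check a pipeline against that budget. The helper names are hypothetical, and `process_frame` stands in for whatever detection step is being measured.

```python
import time

FRAME_BUDGET_MS = 1000.0 / 30  # about 33.3 ms per frame at 30 FPS


def measure_latency_ms(process_frame, frames):
    """Return the wall-clock latency in milliseconds for each processed frame."""
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        process_frame(frame)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def within_budget(latencies, budget_ms=FRAME_BUDGET_MS):
    """True if even the slowest frame stayed under the per-frame budget."""
    return max(latencies) < budget_ms


# Placeholder workload; a real run would call the detection pipeline here.
latencies = measure_latency_ms(lambda frame: None, range(5))
```

Checking the worst-case frame rather than the mean is a deliberate choice in this sketch, since a single slow frame is enough to drop below real time.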

🔨 How I Built It

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                   Edge Device Layer                      │
│  ┌─────────┐  ┌──────────┐  ┌────────────┐             │
│  │ YOLOv8  │→ │MediaPipe │→ │  Motion    │→ Suspicion  │
│  │Detection│  │   Pose   │  │  Analysis  │   Score     │
│  └─────────┘  └──────────┘  └────────────┘             │
└─────────────────────┬───────────────────────────────────┘
                      │ if score > threshold
                      ▼
        ┌─────────────────────────────┐
        │   Key Frames + Metadata     │
        └─────────────┬───────────────┘
                      ▼
┌─────────────────────────────────────────────────────────┐
│              Gemini 3 Agent Layer                        │
│                                                          │
│  Agent 1 (Reasoning) → Agent 2 (Explanation)            │
│         ↓                      ↓                         │
│  Agent 3 (Policy)   ←  Agent 4 (Learning)               │
│                                                          │
└─────────────────────┬───────────────────────────────────┘
                      ▼
              Alert + Explanation
                      ▼
            Streamlit Dashboard
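The flow in the diagram can be sketched as a chain of plain functions. This is only an illustration of the control flow: every agent below is a stub with hypothetical names and thresholds, and no real Gemini call is made (the reasoning step simply echoes the edge score as a confidence).

```python
from dataclasses import dataclass


@dataclass
class Alert:
    escalate: bool
    explanation: str


def reasoning_agent(event):
    # Stub: a real system would send key frames and metadata to the model here.
    return {"label": event["label"], "confidence": event["score"]}


def explanation_agent(assessment):
    return f"Activity '{assessment['label']}' flagged with confidence {assessment['confidence']:.2f}."


def learning_agent(feedback, start=0.8):
    """Raise the escalation threshold after false alarms, lower it after misses."""
    threshold = start
    for was_false_alarm in feedback:
        threshold += 0.02 if was_false_alarm else -0.02
    return min(max(threshold, 0.5), 0.95)


def policy_agent(assessment, threshold):
    return assessment["confidence"] >= threshold


def handle_event(event, edge_threshold=0.6, feedback=()):
    if event["score"] <= edge_threshold:
        return None  # handled entirely on the edge device
    assessment = reasoning_agent(event)
    explanation = explanation_agent(assessment)
    escalate = policy_agent(assessment, learning_agent(feedback))
    return Alert(escalate, explanation)


alert = handle_event({"score": 0.9, "label": "loitering"})
```

Passing a history of false alarms, for example `feedback=(True,) * 6`, raises the threshold enough that the same event is explained but no longer escalated.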

Built With

python

Submitted to

Gemini 3 Hackathon

Created by

Ismail Sadykov
Ammar Ali
I am a Computer Vision Engineer. I transform challenging vision problems into practical, scalable solutions.
Kashyapi Kuhu
Eesha Tariq
Asma Zubair
Fahamgeer Mahesar
Python Developer | ML | DL | CV | NLP | LLMs | GenAI | Agentic AI | AI Agents | Data Analyst | Building Future-Ready AI Systems

Updates

Eesha Tariq started this project — Feb 09, 2026 03:59 PM EST
