Inspiration
We wanted to build a real-time AI system that helps security operators detect and understand incidents as they happen, with explainable alerts and immediate video evidence, all without complex setup or heavy infrastructure.
What It Does
- Detects incidents in real time from live camera feeds using Google Gemini.
- Slices the stream into short MP4 clips and attaches the relevant snippet to each detected event.
- Streams events to the dashboard via Server-Sent Events (SSE) with concise, explainable summaries.
- Allows operators to preview clips instantly and review recent alerts.
- Provides backend APIs to start/stop monitoring and view system status.
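To make the clip-attachment step concrete, here is a minimal Python sketch of how an event timestamp could be mapped to the chunk that contains it. The `Chunk` type, its fields, the file-naming scheme, and the 4-second chunk length are all assumptions for illustration; the project's actual lookup may work differently.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    start: float  # seconds since stream start (hypothetical field)
    path: str     # location of the MP4 clip (hypothetical naming scheme)


def chunk_for_event(chunks: list[Chunk], event_time: float) -> Chunk | None:
    """Return the chunk whose window contains event_time.

    Assumes chunks are sorted by start time and contiguous, so the
    right chunk is the last one starting at or before the event.
    """
    starts = [c.start for c in chunks]
    i = bisect_right(starts, event_time) - 1
    return chunks[i] if i >= 0 else None


# Five contiguous 4-second chunks (made-up values).
chunks = [Chunk(start=i * 4.0, path=f"video_chunks/chunk_{i:04d}.mp4") for i in range(5)]
match = chunk_for_event(chunks, 9.5)
print(match.path if match else "no clip")
```

An event at 9.5 s falls in the third chunk, which starts at 8.0 s. An event earlier than the first chunk returns `None` rather than a wrong clip.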
How We Built It
- Backend: FastAPI service with lifecycle management, CORS, and a video pipeline that chunks streams into video_chunks/. Event detection is handled by Gemini, with optional persistence to PostgreSQL when DATABASE_URL is configured.
- Realtime: SSE endpoint streams new events to clients without polling.
- Frontend: Next.js (App Router) app that renders the live monitoring UI, subscribes to SSE, and plays event-specific clips.
- Data Model: Each event includes timestamp, event code, description, explainability text, and clip URL.
- Operations: Simple endpoints: /start, /stop, /status, /events, /events/id/{event_id}, and /video?filepath=....
- Latency Optimization: We target sub-5s perceived delay by overlapping video chunking and analysis. If chunk length = L and analysis time = A, total delay ≈ max(L, A) + δ (network/IO). We tune L to balance quality, latency, and cost.
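As a rough illustration of the data model and SSE delivery described above, the following Python sketch defines an event record and serializes it as a single SSE message. The field names, the `Event` class, and the `to_sse_frame` helper are hypothetical; the write-up lists only the kinds of fields an event carries, not their exact names or wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Event:
    # Field names are illustrative assumptions, not the project's schema.
    event_id: int
    timestamp: str
    event_code: str
    description: str
    explanation: str
    clip_url: str


def to_sse_frame(event: Event) -> str:
    """Serialize an event as one SSE message.

    An SSE message is a set of "field: value" lines ended by a blank line.
    json.dumps emits no newlines by default, so the payload fits on a
    single data line.
    """
    payload = json.dumps(asdict(event))
    return f"id: {event.event_id}\ndata: {payload}\n\n"


example = Event(
    event_id=7,
    timestamp="2025-01-01T12:00:00Z",       # made-up value
    event_code="EXAMPLE",                   # made-up value
    description="Example incident",
    explanation="Why the model flagged this clip",
    clip_url="/video?filepath=<clip-path>",  # placeholder, not a real path
)
print(to_sse_frame(example), end="")
```

Sending an `id` line with each message gives the client a stable key to track which events it has already seen.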
Challenges
- Reducing latency while maintaining consistent video quality and timely event detection.
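The trade-off can be explored with the delay model from the build notes, total delay ≈ max(L, A) + δ. The Python sketch below evaluates it for a few chunk lengths. The analysis time and overhead are made-up numbers, and analysis time is held constant for simplicity even though in practice it likely grows with chunk length.

```python
def perceived_delay(chunk_s: float, analysis_s: float, overhead_s: float) -> float:
    """Delay model from the write-up: max(L, A) + delta.

    Chunking and analysis overlap, so the slower of the two dominates;
    network/IO overhead is added on top.
    """
    return max(chunk_s, analysis_s) + overhead_s


TARGET_S = 5.0  # the sub-5s goal stated in the write-up

for chunk_s in (2.0, 4.0, 6.0):
    # analysis_s and overhead_s are illustrative assumptions, not measurements.
    delay = perceived_delay(chunk_s, analysis_s=3.0, overhead_s=0.5)
    verdict = "meets target" if delay < TARGET_S else "misses target"
    print(f"L={chunk_s:.1f}s -> delay={delay:.1f}s ({verdict})")
```

With these assumed numbers, 2 s and 4 s chunks give delays of 3.5 s and 4.5 s, while 6 s chunks give 6.5 s and miss the target. Below L = A, shortening the chunk no longer reduces delay because analysis time dominates.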
Accomplishments
- Achieved a clean end-to-end flow from camera to explainable alert with one-click clip playback.
- Implemented smooth, real-time updates using SSE with client-side deduplication.
- Designed a graceful service lifecycle with safe startup/shutdown and optional database persistence.
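Client-side deduplication can be as simple as remembering which event IDs have already been rendered. The sketch below shows that logic in Python for consistency with the backend, although the real client is a Next.js app. The `EventDeduplicator` class is hypothetical, and keying on an event ID is an assumption, since the write-up does not say how duplicates are identified.

```python
class EventDeduplicator:
    """Tracks seen event IDs so a replayed SSE message is shown only once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def is_new(self, event_id: str) -> bool:
        """Return True the first time an ID is seen, False afterwards."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


dedup = EventDeduplicator()
incoming = ["a1", "a2", "a1", "a3", "a2"]  # made-up IDs with repeats
fresh = [event_id for event_id in incoming if dedup.is_new(event_id)]
print(fresh)
```

Repeats can occur, for example, when a client reconnects and the server resends recent events. A long-running client would also want to bound the size of the seen set, which this sketch omits.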
What We Learned
- In operator workflows, clarity beats volume — short, high-signal summaries with precise clips outperform verbose analytics.
- SSE is ideal for one-way real-time alerts — simpler and more efficient than WebSockets for this scenario.
- Clear architectural seams (typed schemas, defined endpoints) make AI components easy to swap without breaking the UI.
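One way to picture such a seam is a small typed interface that any detector must satisfy. This Python sketch uses `typing.Protocol`; the `Detector` interface, the `StubDetector` class, and the dictionary keys are hypothetical and are not taken from the project's code.

```python
from typing import Protocol


class Detector(Protocol):
    """Anything with a matching detect() method counts as a detector."""

    def detect(self, clip_path: str) -> list[dict]:
        ...


class StubDetector:
    """Stand-in for a model-backed detector; always returns one fixed event."""

    def detect(self, clip_path: str) -> list[dict]:
        return [{"event_code": "TEST", "description": "stub event", "clip_url": clip_path}]


def analyze(detector: Detector, clip_path: str) -> list[dict]:
    """Run whichever detector is supplied; callers never see the model."""
    return detector.detect(clip_path)


events = analyze(StubDetector(), "video_chunks/example.mp4")  # hypothetical path
print(events[0]["event_code"])
```

Because `analyze` depends only on the interface, a different model could be swapped in without changes to the code that consumes the events, provided it returns the same shape.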
What’s Next
- Multi-Camera Orchestration: Support for multiple feeds with per-camera policies and prioritization.
- Edge Acceleration: On-premise inference to reduce latency and cost.