Inspiration

We wanted to build a real-time AI system that helps security operators detect and understand incidents as they happen, with explainable alerts and immediate video evidence, all without complex setup or heavy infrastructure.

What It Does

  • Detects incidents in real time from live camera feeds using Google Gemini.
  • Slices the stream into short MP4 clips and attaches the relevant snippet to each detected event.
  • Streams events to the dashboard via Server-Sent Events (SSE) with concise, explainable summaries.
  • Allows operators to preview clips instantly and review recent alerts.
  • Provides backend APIs to start/stop monitoring and view system status.
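
To make the SSE step concrete, here is a minimal sketch of how one event could be serialized as a Server-Sent Events frame. The Event class, its field names, and to_sse_frame are assumptions made for illustration (the fields follow the data model described in this write-up); this is not the project's actual code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Event:
    # Hypothetical field names based on the data model in this write-up.
    event_id: str
    timestamp: str
    event_code: str
    description: str
    explanation: str
    clip_url: str


def to_sse_frame(event: Event) -> str:
    """Serialize one event as a single Server-Sent Events frame."""
    # json.dumps emits one line, so a single "data:" field is enough.
    payload = json.dumps(asdict(event))
    # "id:" lets the browser report Last-Event-ID when it reconnects;
    # the empty line at the end terminates the frame.
    return f"id: {event.event_id}\ndata: {payload}\n\n"
```

A streaming endpoint would write frames like this to a response served with the text/event-stream content type, and the dashboard would parse the data field as JSON.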

How We Built It

  • Backend: FastAPI service with lifecycle management, CORS, and a video pipeline that chunks streams into video_chunks/. Event detection is handled by Gemini, with optional persistence to PostgreSQL when DATABASE_URL is configured.
  • Realtime: SSE endpoint streams new events to clients without polling.
  • Frontend: Next.js (App Router) app that renders the live monitoring UI, subscribes to SSE, and plays event-specific clips.
  • Data Model: Each event includes timestamp, event code, description, explainability text, and clip URL.
  • Operations: Simple endpoints — /start, /stop, /status, /events, /events/id/{event_id}, and /video?filepath=....
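  • Latency Optimization: We target a perceived delay under 5 s by overlapping video chunking with analysis. If the chunk length is L and the analysis time per chunk is A, the delay is approximately max(L, A) + δ, where δ is network and I/O overhead; without the overlap the two stages would add up to L + A + δ. We tune L to balance quality, latency, and cost.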
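
The latency estimate above can be explored with a few lines of Python. This is a sketch of the formula only: the function names and the sample numbers (3 s of analysis, 0.5 s of overhead) are illustrative assumptions, not measurements from the project.

```python
def pipelined_delay(chunk_s: float, analysis_s: float, overhead_s: float) -> float:
    """Approximate delay when chunking and analysis overlap: max(L, A) + delta."""
    return max(chunk_s, analysis_s) + overhead_s


def meets_target(chunk_s: float, analysis_s: float, overhead_s: float,
                 target_s: float = 5.0) -> bool:
    """Check an (L, A, delta) combination against the sub-5 s target."""
    return pipelined_delay(chunk_s, analysis_s, overhead_s) < target_s


if __name__ == "__main__":
    # Hypothetical values: analysis takes 3 s, network/IO overhead is 0.5 s.
    for chunk_s in (2.0, 4.0, 6.0):
        delay = pipelined_delay(chunk_s, analysis_s=3.0, overhead_s=0.5)
        print(f"L={chunk_s}s -> delay={delay}s, ok={meets_target(chunk_s, 3.0, 0.5)}")
```

With these made-up numbers, shrinking L below A stops helping because the analysis time becomes the dominant term, which is one reason L has to be tuned rather than simply minimized.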

Challenges

  • Reducing latency while maintaining consistent video quality and timely event detection.

Accomplishments

  • Achieved a clean end-to-end flow from camera to explainable alert with one-click clip playback.
  • Implemented smooth, real-time updates using SSE with client-side deduplication.
  • Designed a graceful service lifecycle with safe startup/shutdown and optional database persistence.
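
The deduplication idea can be sketched as follows. The real client runs in the browser, so this Python version only illustrates the logic; keying on an event ID, the EventDeduplicator name, and the bounded history size are all assumptions rather than details taken from the project.

```python
from collections import OrderedDict


class EventDeduplicator:
    """Remember recently seen event IDs and reject repeats."""

    def __init__(self, max_remembered: int = 1000) -> None:
        # Insertion-ordered mapping used as a bounded set of IDs.
        self._seen = OrderedDict()
        self._max = max_remembered

    def is_new(self, event_id: str) -> bool:
        """Return True the first time an ID is seen, False for repeats."""
        if event_id in self._seen:
            return False
        self._seen[event_id] = None
        if len(self._seen) > self._max:
            # Drop the oldest ID so memory stays bounded.
            self._seen.popitem(last=False)
        return True
```

A check like this matters because an SSE connection can reconnect and receive events the client has already rendered; filtering on the ID keeps the alert list free of duplicates.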

What We Learned

  • In operator workflows, clarity beats volume — short, high-signal summaries with precise clips outperform verbose analytics.
  • SSE is ideal for one-way real-time alerts — simpler and more efficient than WebSockets for this scenario.
  • Clear architectural seams (typed schemas, defined endpoints) make AI components easy to swap without breaking the UI.
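
One way to picture such a seam is a small typed interface between the pipeline and the model. The sketch below is hypothetical: Detection, Detector, StubDetector, and process_clip are names invented for illustration, and a Gemini-backed class would simply implement the same analyze method.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Detection:
    # Hypothetical typed result returned by any detector.
    event_code: str
    description: str
    explanation: str


class Detector(Protocol):
    """Anything with this method can be plugged into the pipeline."""

    def analyze(self, clip_path: str) -> list[Detection]: ...


class StubDetector:
    """Stand-in detector, useful for tests or offline demos."""

    def analyze(self, clip_path: str) -> list[Detection]:
        return [Detection("TEST", f"stub result for {clip_path}", "always fires")]


def process_clip(detector: Detector, clip_path: str) -> list[dict]:
    """Turn detections into event payloads without knowing which model ran."""
    return [
        {
            "event_code": d.event_code,
            "description": d.description,
            "explainability": d.explanation,
            "clip_url": clip_path,
        }
        for d in detector.analyze(clip_path)
    ]
```

Because process_clip depends only on the Detector protocol, swapping the model changes one class and leaves the payload shape seen by the UI untouched.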

What’s Next

  • Multi-Camera Orchestration: Support for multiple feeds with per-camera policies and prioritization.
  • Edge Acceleration: On-premise inference to reduce latency and cost.
