Threat Detection System + SuperSafe Monitoring — Project Summary

Inspiration

Security cameras and webcams are everywhere, but they often send video to the cloud, store raw footage on company servers, or lock you into proprietary ecosystems. We wanted to explore a different approach: AI-powered threat detection that stays privacy-first in the browser and responds with a decentralized backend. The idea was to use a webcam or security camera to process video locally, send only selected frames to an AI API for minimal metadata (e.g. “Person holding a knife” or “Unknown person near front door”), and then have an autonomous agent on Akash handle analysis, alerts, and escalation—so both the camera side and the response side avoid traditional centralized clouds.

What it does

SuperSafe Monitoring (Venice) turns a webcam into a real-time threat detector. Video is processed locally in the browser; a single frame is captured, compressed, and sent to the Venice API. The AI returns structured metadata: threat level (none | low | medium | high), summary, confidence, and suggested action. Detected threats appear in a Threat Activity Timeline; for medium or high levels, the app can speak an alert via Venice TTS. Raw video is never stored. Only high-level event summaries are kept, optionally persisted in the browser's localStorage. Each detected threat is also sent to the Akash Threat Detection backend so it becomes a case in the central system.

Threat Detection System (Akash) runs entirely on Akash with AkashML for LLM inference. Every incoming threat (from Venice or a manual report) is analyzed by the agent, which produces a scenario summary, recommended actions, and dispatch suggestions (e.g. police, medical). Area managers receive email and SMS; high-risk or repeated incidents trigger emergency tickets and optional webhook calls to emergency services. A React dashboard shows case history, danger zones, and emergency tickets. The agent deduplicates alerts (no spam when the same threat is reported repeatedly), and ticket logic runs only when multiple threats occur in the same place within a short window. In other words, one timeline event = one case, and tickets are a separate, later step.
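The ticket rule described above (several threats at the same place within a short window) can be sketched roughly as follows. This is an illustration in Python, not the project's actual code: the names `Threat` and `should_open_ticket`, the restriction to "high" threats, and the defaults of two threats within five minutes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    place: str        # e.g. "Building ZOX / Room 411"
    level: str        # "none" | "low" | "medium" | "high"
    timestamp: float  # seconds since the epoch


def should_open_ticket(threats, place, now, window_seconds=300.0, min_threats=2):
    """Return True when enough high-level threats cluster at one place.

    The window and threshold are illustrative assumptions, not values
    taken from the project.
    """
    recent = [
        t for t in threats
        if t.place == place
        and t.level == "high"
        and 0 <= now - t.timestamp <= window_seconds
    ]
    return len(recent) >= min_threats
```

Keeping this check separate from case creation preserves the "one event, one case" rule: every threat is stored as a case, and the ticket decision is a second pass over recent cases.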

Together: privacy-first detection in the browser plus decentralized, AI-powered response on Akash.

How we built it

SuperSafe (Venice) — front-end
React + Vite + TypeScript, Tailwind CSS. Camera access via getUserMedia, hidden <canvas> for frame capture. Frames are sent to Venice chat completions with a prompt that asks for JSON: threatLevel, summary, confidence, suggestedAction. We parse the response; medium/high threats go to the timeline and TTS queue. A bridge (sendThreatToAkash) POSTs each threat to the Akash backend with fixed room/building/contact (e.g. Room 411, Building ZOX), so every timeline event becomes a case in Case History. Timeline is persisted in localStorage; a “Clear timeline” option resets it.
Threat Detection backend (Akash) — Python
FastAPI app with SQLite (cases, contacts, emergency tickets, agent state). Each POST /threats is handled by an agent that looks up similar past cases, calls AkashML (Llama 3.3 70B) for analysis and dispatch recommendations, checks for a recent duplicate (same place and description within 60 seconds) to avoid resending email/SMS, sends email (Resend) and SMS (Twilio) to the area manager, and, for high-risk threats, notifies an emergency webhook. If multiple high-priority threats occur in the same place within a few minutes, the agent creates or updates an emergency ticket and sends a ticket payload to the webhook. Danger zones are marked using historical data and an LLM assessment. An autonomous agent loop runs in the background, writing a heartbeat to the DB for fault tolerance.
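The 60-second duplicate check could look something like the sketch below. The table layout (`cases` with `place`, `description`, `created_at`) and the function names are hypothetical; only the rule itself (same place and description within 60 seconds suppresses notifications) comes from the description above.

```python
import sqlite3

DEDUP_WINDOW_SECONDS = 60  # window stated in the backend description


def create_schema(conn):
    # Hypothetical minimal schema; the real cases table has more columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cases ("
        "id INTEGER PRIMARY KEY, place TEXT, description TEXT, created_at REAL)"
    )
    conn.commit()


def is_recent_duplicate(conn, place, description, now):
    """True if the same place and description were stored within the window."""
    row = conn.execute(
        "SELECT 1 FROM cases "
        "WHERE place = ? AND description = ? AND created_at >= ? LIMIT 1",
        (place, description, now - DEDUP_WINDOW_SECONDS),
    ).fetchone()
    return row is not None


def record_case(conn, place, description, now):
    """Always store the case; return whether email/SMS should be sent."""
    notify = not is_recent_duplicate(conn, place, description, now)
    conn.execute(
        "INSERT INTO cases (place, description, created_at) VALUES (?, ?, ?)",
        (place, description, now),
    )
    conn.commit()
    return notify
```

Note that in this sketch every report is still stored as a case, so a threat that keeps being reported at short intervals stays silenced until there is a 60-second gap.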
Dashboard
React (Vite) app built and served from the same backend. Shows stats (total cases, ongoing, danger zones), Danger Zones list, Emergency Tickets table, and Case History with filters. A sliding panel has the manual “Report Threat” form. Data is refreshed on an interval; a reset endpoint clears all data for fresh testing.
Deployment
Docker image: multi-stage build (Node for dashboard, Python for API). Akash SDL defines one service, exposes port 8000 as 80, and passes env vars (AKASHML_API_KEY, Resend, Twilio, emergency webhook). Deploy via Akash Console; the app runs fully on Akash with AkashML for inference.
Robustness
Venice: missing API key, camera errors, and API failures show clear messages; TTS is queued and non-blocking. Backend: dedup window to avoid duplicate notifications; 404 on optional endpoints (e.g. emergency-tickets) handled gracefully; /reset for clean state.
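As an illustration of the heartbeat that the background agent loop writes to the database, here is a minimal sketch. The `agent_state` key/value layout and the function names are assumptions; the write-up does not say how the heartbeat is stored or how it is read back after a restart.

```python
import sqlite3


def init_state(conn):
    # Hypothetical key/value table for agent state.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_state ("
        "key TEXT PRIMARY KEY, value REAL NOT NULL)"
    )
    conn.commit()


def write_heartbeat(conn, now):
    """Overwrite the single heartbeat row with the current time."""
    conn.execute(
        "INSERT OR REPLACE INTO agent_state (key, value) VALUES ('heartbeat', ?)",
        (now,),
    )
    conn.commit()


def seconds_since_heartbeat(conn, now):
    """Return the heartbeat age in seconds, or None if none was ever written."""
    row = conn.execute(
        "SELECT value FROM agent_state WHERE key = 'heartbeat'"
    ).fetchone()
    return None if row is None else now - row[0]
```

On startup, an agent could compare `seconds_since_heartbeat` with its loop interval to tell a clean restart from a long outage and decide whether to re-scan recent cases.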

Challenges we ran into

Structured output from the vision model — Getting Venice to return only valid JSON (no markdown or extra text) required prompt tuning and a fallback parser so invalid responses default to “none” instead of crashing.
Camera and lifecycle — Stopping the media stream on unmount and avoiding state updates after “Stop Monitoring” needed careful useEffect cleanup and a cancelled flag to prevent leaks and stale results.
Unifying two systems — Keeping Venice privacy-first (no raw video storage) while still feeding a central system meant sending only minimal, per-threat payloads from the browser to Akash and designing the backend so “one timeline event = one case” with ticket logic as a separate, cluster-based step.
Akash deployment — Ensuring API routes (e.g. /emergency-tickets) were registered before any static/dashboard routes so they never returned 404, and documenting env vars and reset flow for judges.
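The fallback parser mentioned under structured output can be sketched as follows. The real parser lives in the TypeScript front end; this version is written in Python purely to show the logic, and the function name, the default values, and the "first brace to last brace" extraction strategy are assumptions.

```python
import json
import re

VALID_LEVELS = ("none", "low", "medium", "high")

# Returned whenever the reply cannot be trusted (values are assumptions).
SAFE_DEFAULT = {
    "threatLevel": "none",
    "summary": "",
    "confidence": 0.0,
    "suggestedAction": "",
}


def parse_threat_response(text):
    """Extract the JSON object from a model reply, or fall back to SAFE_DEFAULT."""
    # Take everything from the first "{" to the last "}" so that Markdown
    # fences or prose around the object are ignored.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return dict(SAFE_DEFAULT)
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(SAFE_DEFAULT)
    if not isinstance(data, dict) or data.get("threatLevel") not in VALID_LEVELS:
        return dict(SAFE_DEFAULT)
    # Keep only the expected keys; fill missing ones from the default.
    return {key: data.get(key, default) for key, default in SAFE_DEFAULT.items()}
```

Defaulting to "none" keeps a malformed reply from raising a false alarm or crashing the loop, at the cost of possibly missing a real threat in that frame.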

Accomplishments that we're proud of

Privacy-first detection — A working threat-detection flow that never stores raw video and keeps only minimal event summaries in the browser, with a clear Privacy section and optional persistence in localStorage.
Full pipeline on Akash — Backend and agent run entirely on Akash; LLM calls go to AkashML only. No centralized cloud fallback; the agent is stateful and fault-tolerant across restarts.
Real-world actions — Email and SMS to managers, emergency webhooks, and automatic emergency tickets when multiple threats cluster in time and place, with AI-driven scenario summaries and dispatch recommendations.
End-to-end integration — From live camera → Venice AI → timeline and TTS in the browser, to Akash backend → case history, danger zones, and tickets, with a single dashboard and a reset path for demos.
Structured AI output — Reliable JSON from both the vision model (Venice) and the text model (AkashML) via prompt design and defensive parsing, so the UI and agent logic stay consistent.

What we learned

Vision and speech — Using Venice for both frame analysis and TTS simplified the stack. An audio queue for TTS and non-blocking failure handling kept the main detection flow reliable.
Decentralized backend — Running the response logic on Akash with AkashML showed how to keep inference and orchestration on the same decentralized network and how to design an agent that does real actions (notifications, tickets) with persistent state.
Privacy and minimal data — Deciding what not to store (no cloud DVR, no server-side video) made the architecture clear and made it easy to explain the privacy story while still enabling a powerful response system.
Lifecycle and integration — React hooks for camera and analysis, plus a thin bridge from Venice to Akash, gave a clean separation between “detect in browser” and “analyze and act on Akash.”

What's next

Multi-camera and device support — Support for selecting cameras or integrating IP/RTSP streams so one dashboard can monitor several feeds and still push each detection to Akash as a case.
Refined prompts and models — Iterate on Venice and AkashML prompts and try other models to improve accuracy and reduce false positives.
Notifications to external systems — In addition to email/SMS and emergency webhooks, add optional Telegram, Discord, or Slack alerts so users get instant notifications even when the browser tab is closed.
Venice → Akash in production — Harden the bridge (retries, backoff) and allow configurable room/building/contact per camera or per deployment so multiple sites can use the same Akash backend.
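One possible shape for the retry-with-backoff hardening of the bridge is sketched below. The bridge itself runs in the browser, so this Python version only illustrates the logic; `post_with_retry`, its parameters, and the choice to retry on `OSError` (which covers connection failures in Python) are assumptions rather than planned code.

```python
import time


def post_with_retry(send, payload, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send(payload) until it succeeds or max_attempts is reached.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
    `send` is any callable that raises OSError on a network failure.
    """
    for attempt in range(max_attempts):
        try:
            return send(payload)
        except OSError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller surface the error
            sleep(base_delay * 2 ** attempt)
```

Passing `send` and `sleep` in as arguments keeps the retry policy independent of the HTTP client and makes it easy to test without real delays.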