Inspiration

Traditional security systems operate on reactive thresholds: motion detected, alarm triggered. We asked a different question: what if security systems could understand intent?

Most access attempts are benign. Delivery personnel, residents, maintenance workers—these are legitimate visitors. But distinguishing intent from raw sensor data requires reasoning that goes beyond simple threshold detection. We built Sententia to demonstrate how large language models can bridge this gap, providing context-aware security that understands the who, what, and why of each interaction.

The name reflects this philosophy: "sententia" means judgment or opinion in Latin. True security requires judgment, not just reaction.

What it does

Sententia is a distributed security architecture that separates computational intelligence from edge operation. The system combines Gemini-powered intent classification with a live Raspberry Pi camera stream for real-time threat detection and response.

Brain Component (laptop)

  • Streamlit dashboard for real-time monitoring and control
  • LLM-powered intent classification using Google Gemini
  • Memory and context management for pattern recognition
  • Decision engine that translates classified intent into deterrent responses

Edge Component (Raspberry Pi)

  • Live camera module streaming JPEG frames over HTTP
  • HTTP server exposing three core endpoints:
      • GET /frame.jpg streams the live camera feed
      • POST /deterrent triggers context-aware responses (flash, siren, notification)
      • GET /status reports system health and operational state
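These endpoints are simple enough to serve from Python's standard library. Below is a minimal sketch of the edge server, with the camera capture stubbed out (on the Pi the frame would come from the camera module, e.g. via picamera2); the port and payload shapes are illustrative assumptions, not the project's actual contract:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def capture_frame() -> bytes:
    # Stub: on the Pi this would grab a JPEG from the camera module.
    # Static bytes (with JPEG start/end markers) keep the sketch runnable.
    return b"\xff\xd8\xff\xe0 fake-jpeg \xff\xd9"


class EdgeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/frame.jpg":
            frame = capture_frame()
            self.send_response(200)
            self.send_header("Content-Type", "image/jpeg")
            self.send_header("Content-Length", str(len(frame)))
            self.end_headers()
            self.wfile.write(frame)
        elif self.path == "/status":
            body = json.dumps({"ok": True, "camera": "ready"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def do_POST(self):
        if self.path == "/deterrent":
            length = int(self.headers.get("Content-Length", 0))
            cmd = json.loads(self.rfile.read(length) or b"{}")
            # Here the Pi would flash lights or sound a siren based on cmd["action"].
            body = json.dumps({"triggered": cmd.get("action")}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


# On the Pi: HTTPServer(("0.0.0.0", 8000), EdgeHandler).serve_forever()
```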

The architecture allows the edge device to remain computationally minimal while the brain handles sophisticated inference. Deterrent responses adapt based on classified intent—distinguishing between legitimate access and actual threats using real-time visual data from the Pi camera.

How we built it

Technology Stack

  • Python (94.5%) for both brain and edge implementation
  • Streamlit for real-time dashboard and monitoring
  • Google Gemini API for LLM-powered reasoning
  • Raspberry Pi with camera module for live video capture
  • HTTP with JSON payloads for universal hardware compatibility
  • Shell scripts for deployment orchestration

Development Approach

We deployed a fully functional Raspberry Pi edge server that captures the live camera feed and exposes HTTP endpoints:

  1. Pi camera module streams JPEG frames via GET /frame.jpg
  2. Brain fetches frames and sends them to Gemini for intent classification
  3. Classified intent triggers deterrent responses via POST /deterrent
  4. System state is monitored via GET /status
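The four steps above can be sketched as a brain-side loop. The Gemini call is stubbed here (the real version would go through the Gemini API client with an image part and a constrained prompt), and the edge address, intent labels, and intent-to-deterrent mapping are illustrative assumptions:

```python
import json
import urllib.request
from typing import Optional

EDGE = "http://raspberrypi.local:8000"  # hypothetical edge address

# Hypothetical mapping from classified intent to deterrent action.
RESPONSES = {
    "delivery": None,   # benign: no deterrent
    "resident": None,
    "loitering": "flash",
    "intruder": "siren",
}


def classify_intent(frame: bytes) -> str:
    """Send the JPEG to Gemini and return an intent label. Stubbed in this sketch."""
    raise NotImplementedError


def decide(intent: str) -> Optional[str]:
    # Unrecognized labels default to the mildest deterrent rather than none.
    return RESPONSES.get(intent, "flash")


def step() -> None:
    frame = urllib.request.urlopen(f"{EDGE}/frame.jpg").read()  # 1. fetch frame
    intent = classify_intent(frame)                             # 2. classify intent
    action = decide(intent)                                     # 3. choose response
    if action:
        req = urllib.request.Request(
            f"{EDGE}/deterrent",
            data=json.dumps({"action": action}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)                             # trigger deterrent
```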

The brain-edge communication is completely hardware-agnostic. The Pi serves frames; the brain provides intelligence and makes decisions. This separation of concerns enables rapid iteration on the AI layer without touching edge code.

Key Design Decision

We chose stateless HTTP communication with JSON payloads. This maximizes compatibility while maintaining simplicity. The brain-edge contract is: edge provides live frames and accepts deterrent commands; brain provides intelligence and decision-making. The Pi camera module handles the heavy lifting of frame capture, and Gemini handles the heavy lifting of intent reasoning.

Challenges we ran into

Real-Time Frame Processing with Network Latency

Streaming video from a Pi camera over HTTP introduces network delays and bandwidth constraints. We optimized by:

  • Requesting frames only when needed rather than continuous streaming
  • Compressing JPEG quality to reduce bandwidth while maintaining usability
  • Implementing local caching of intent classifications to avoid redundant API calls
  • Designing the system so edge operations continue even if the brain is momentarily unresponsive
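The first and last points can be combined in a small helper: fetch a single frame on demand with a short timeout, and treat network trouble as a skipped cycle rather than a fatal error. This is a sketch under assumed defaults, not the project's exact implementation:

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_frame(url: str, timeout: float = 2.0) -> Optional[bytes]:
    """Fetch one JPEG on demand. Returns None on timeout or network error so
    the caller can skip this cycle and keep the rest of the system responsive."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (urllib.error.URLError, TimeoutError):
        return None
```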

Camera Integration on Raspberry Pi

Setting up the Pi camera module to reliably stream frames required:

  • Proper camera configuration in raspi-config
  • Handling device permissions correctly
  • Ensuring consistent frame capture and JPEG encoding
  • Testing across different lighting conditions and distances

Intent Classification from Real Visual Data

Moving from mock frames to the actual camera feed introduced real-world challenges:

  • Varying lighting conditions affect image quality
  • Camera angle and distance impact what Gemini can classify
  • Need for robust error handling when frames are unclear or degraded
  • Balancing API call frequency with response latency

LLM Integration and Cost

Calling Gemini for every frame would be expensive and slow. We addressed this through:

  • Intelligent frame sampling (not every frame needs classification)
  • Caching intent classifications for identical or similar frames
  • Batching analysis when possible
  • Designing fallback behavior when API latency is high
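One way to implement the caching point above is a digest-keyed cache with a time-to-live. This sketch keys on an exact SHA-256 of the JPEG bytes; it is an illustrative assumption, and a real system might use a perceptual hash to also catch visually similar (not just identical) frames:

```python
import hashlib
import time
from typing import Optional


class IntentCache:
    """Cache intent labels by frame digest so repeated frames skip the LLM call."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        # digest -> (intent label, time the label was stored)
        self._entries = {}

    def get(self, frame: bytes) -> Optional[str]:
        key = hashlib.sha256(frame).hexdigest()
        entry = self._entries.get(key)
        if entry and time.monotonic() - entry[1] < self.ttl:
            return entry[0]  # fresh cached classification
        return None          # miss or expired: caller should query the LLM

    def put(self, frame: bytes, intent: str) -> None:
        key = hashlib.sha256(frame).hexdigest()
        self._entries[key] = (intent, time.monotonic())
```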

Accomplishments that we're proud of

  • Live Raspberry Pi camera integration streaming real video data to the brain for analysis
  • Working end-to-end system from Pi camera capture to Gemini intent classification to deterrent response
  • Functional Streamlit dashboard displaying real-time camera feed and system decisions
  • Optimized frame processing that handles network latency and bandwidth constraints intelligently
  • Production-ready architecture that scales from prototype to deployment
  • Clean brain-edge separation enabling independent development and testing
  • Rapid prototyping: a full end-to-end demonstration on real hardware within the hackathon timeframe

What we learned

Intent Recognition from Real Video is Hard

Moving from mock frames to the actual Pi camera feed revealed how context-dependent visual analysis is. Lighting, angle, distance, and speed all affect what an LLM can understand. Successful intent classification requires:

  • Multiple frames for context (not just a single snapshot)
  • Understanding of the physical environment
  • Feedback loops to improve over time
  • Human oversight for high-stakes decisions

Network-Aware Design is Essential

Running an edge device over HTTP forced us to think carefully about bandwidth, latency, and reliability. We learned that assuming fast, reliable networks is dangerous—real-world deployments need graceful degradation.

The Pi Camera Module is Surprisingly Capable

The Raspberry Pi camera produces sufficient image quality for LLM analysis even in modest lighting. This opened possibilities for deploying sophisticated AI reasoning on genuinely minimal hardware.

Separating Intelligence from Execution Scales

By keeping the Pi simple (just capture and execute) and moving reasoning to the laptop/cloud, we created a system that can evolve the AI layer without touching edge code. This separation is powerful.

LLMs Excel at Contextual Reasoning

Gemini's ability to reason about the context and intent behind visual scenes far exceeds traditional computer vision approaches. It understands semantics in ways that threshold-based detection cannot.

What's next for Sententia

  • Multi-Camera Deployment: Add multiple Pi cameras or expand to multiple Pi devices for comprehensive coverage
  • Adaptive Frame Sampling: Implement intelligent sampling based on motion detection to reduce API calls while maintaining responsiveness
  • Behavioral Learning: Build long-term patterns of visitor behavior to improve intent classification accuracy
  • Persistent Memory: Maintain context about regular visitors and flag unusual behavioral deviations
  • Thermal and Audio Integration: Add thermal imaging and audio analysis for richer intent understanding
  • Edge Intelligence: Offload some classification to the Pi using lightweight models to reduce latency
  • Feedback Loops: Implement mechanisms for the system to learn from misclassifications
  • Scalable Brain: Design the brain to manage multiple Pi edge devices simultaneously
  • Privacy-Preserving Deployment: Explore on-device processing for sensitive scenarios

Sententia demonstrates that intelligent security is achievable when you separate the reasoning layer (cloud/laptop with Gemini) from the execution layer (Pi with camera). Real hardware and real-time video make this more than a proof of concept—it's a blueprint for practical, intelligent edge security systems.

Built With

  • python
  • shell (bash)
  • raspberry-pi
  • google-gemini
  • streamlit
  • http/rest
  • llm
  • cloud services