Gemini Integration Description
Guardian Angel AI is built entirely around Google’s Gemini multimodal ecosystem and was developed and optimized using Google AI Studio with the Gemini 3 Pro Preview Code Assistant for rapid prototyping, debugging, and architectural refinement.
At the core of the system is the Gemini Live Multimodal API (gemini-2.5-flash-native-audio-preview-12-2025), which powers continuous environmental awareness and real-time reasoning. The application streams raw PCM audio and live camera frames through a persistent WebSocket connection. This enables Gemini to interpret sound patterns, visual scenes, and environmental context simultaneously. Using structured tool calls, the model updates the application’s safety status (CALM, CHANGING, URGENT) and generates adaptive spoken guidance, while the user interface synchronously updates its visual status indicators and advisory messages.
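As an illustration of how a structured tool call could drive the safety status, here is a minimal sketch. The tool name `update_safety_status`, the `advisory` field, and all helper names are assumptions made for this example and are not taken from the project's code; only the three status values come from the description above.

```typescript
// Hypothetical sketch: the tool name and argument fields are assumptions,
// not the project's actual schema.
type SafetyStatus = "CALM" | "CHANGING" | "URGENT";

interface StatusUpdate {
  status: SafetyStatus;
  advisory: string;
}

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

const STATUSES: readonly string[] = ["CALM", "CHANGING", "URGENT"];

// Type guard: true only for one of the three known status strings.
function isSafetyStatus(value: unknown): value is SafetyStatus {
  return typeof value === "string" && STATUSES.indexOf(value) !== -1;
}

// Returns a validated update, or null when the call is not a
// well-formed status update.
function parseStatusToolCall(call: ToolCall): StatusUpdate | null {
  if (call.name !== "update_safety_status") return null;
  const { status, advisory } = call.args;
  if (!isSafetyStatus(status)) return null;
  return { status, advisory: typeof advisory === "string" ? advisory : "" };
}

// Applies a tool call to the UI state; malformed calls leave it unchanged.
function applyToolCall(current: StatusUpdate, call: ToolCall): StatusUpdate {
  const update = parseStatusToolCall(call);
  return update === null ? current : update;
}
```

Validating the arguments before touching the UI state means an unexpected or malformed tool call cannot put the interface into an unknown status.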
To maintain uninterrupted perception, a custom “heartbeat” mechanism ensures Gemini regularly re-evaluates incoming visual data, preserving situational awareness even during user silence.
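One way such a heartbeat could work is sketched below. This is an assumption about the mechanism, not the project's implementation: the `Heartbeat` class, the `sendNudge` callback, and the interval are hypothetical. The idea is that any activity resets a timer, and a nudge asking the model to re-evaluate the latest frame is sent only after a quiet period.

```typescript
// Hypothetical heartbeat sketch; the interval and callback are assumptions.
class Heartbeat {
  private readonly intervalMs: number;
  private readonly sendNudge: () => void;
  private lastActivityMs: number;

  constructor(intervalMs: number, sendNudge: () => void, startMs: number) {
    this.intervalMs = intervalMs;
    this.sendNudge = sendNudge;
    this.lastActivityMs = startMs;
  }

  // Call whenever the user speaks or a prompt is sent,
  // so nudges only fill quiet gaps.
  noteActivity(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // Call periodically; sends a nudge once the quiet period
  // reaches the interval, then restarts the timer.
  tick(nowMs: number): boolean {
    if (nowMs - this.lastActivityMs < this.intervalMs) return false;
    this.sendNudge();
    this.lastActivityMs = nowMs;
    return true;
  }
}
```

In a browser, `tick` could be driven by something like `setInterval(() => heartbeat.tick(Date.now()), 1000)`. Passing the time in as a parameter keeps the logic easy to test without real timers.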
The Recap Archive saves important incident data so users can review urgent moments later, even if they missed them when they happened. For this feature, Guardian Angel AI integrates the Gemini 3 Flash Preview API (gemini-3-flash-preview). When urgent events are detected, short video clips are automatically captured and analyzed asynchronously. Gemini produces structured summaries, explanations, and preventative recommendations, which are stored in the Danger Recap system for later user review.
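To make the idea of a structured summary concrete, the sketch below validates a JSON response before it is archived. The `RecapEntry` shape and its field names are assumptions chosen to mirror the three outputs named above (summary, explanation, recommendations); the project's real schema may differ.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface RecapEntry {
  id: string;
  capturedAtMs: number;
  summary: string;
  explanation: string;
  recommendations: string[];
}

// Parses the model's JSON text into a RecapEntry.
// Returns null if the text is not valid JSON or lacks required fields.
function parseRecap(jsonText: string, id: string, capturedAtMs: number): RecapEntry | null {
  let raw: unknown;
  try {
    raw = JSON.parse(jsonText);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;

  const { summary, explanation, recommendations } = raw as Record<string, unknown>;
  if (typeof summary !== "string" || typeof explanation !== "string") return null;
  if (!Array.isArray(recommendations)) return null;

  // Keep only string recommendations; drop anything else.
  const tips = recommendations.filter(
    (item): item is string => typeof item === "string"
  );
  return { id, capturedAtMs, summary, explanation, recommendations: tips };
}
```

A validated entry like this could then be written to IndexedDB (listed under Built With) keyed by `id`, so a malformed response never reaches the archive.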
Throughout development, Google AI Studio and the Gemini 3 Pro Preview Code Assistant were used to implement the app, validate multimodal pipelines, optimize latency, and ensure production-grade reliability.
Together, the Gemini Live Multimodal API and Gemini 3 Flash Preview API enable Guardian Angel AI to function as a robust, intelligent safety companion that combines real-time perception, contextual reasoning, and persistent learning, directly aligning with the hackathon’s focus on advanced orchestration, innovation, and real-world impact.
Standard Setup: Standard video feeds (from CCTV cameras or smart glasses), with audio input and output through speakers or earbuds. You can select these as the camera and audio devices in Chrome's camera and microphone permissions. Alternatively, you can simply use your device's built-in camera and audio.
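If device selection were done in code rather than through Chrome's permission settings, it might look like the sketch below. The helper names and the label-matching approach are assumptions for illustration; the `deviceId: { exact: ... }` constraint shape is the standard one accepted by `getUserMedia`.

```typescript
// Minimal local copy of the fields used from the browser's device list.
interface DeviceInfo {
  deviceId: string;
  kind: string;
  label: string;
}

type DeviceChoice = boolean | { deviceId: { exact: string } };

interface StreamConstraints {
  video: DeviceChoice;
  audio: DeviceChoice;
}

// Finds the first device of the given kind whose label contains labelPart
// (case-insensitive). Hypothetical helper.
function findDevice(devices: DeviceInfo[], kind: string, labelPart: string): DeviceInfo | undefined {
  const wanted = labelPart.toLowerCase();
  for (const device of devices) {
    if (device.kind === kind && device.label.toLowerCase().indexOf(wanted) !== -1) {
      return device;
    }
  }
  return undefined;
}

// Builds constraints that pin the matched devices, falling back to the
// browser default (true) when no label matches.
function buildConstraints(devices: DeviceInfo[], cameraLabel: string, micLabel: string): StreamConstraints {
  const camera = findDevice(devices, "videoinput", cameraLabel);
  const mic = findDevice(devices, "audioinput", micLabel);
  return {
    video: camera ? { deviceId: { exact: camera.deviceId } } : true,
    audio: mic ? { deviceId: { exact: mic.deviceId } } : true,
  };
}
```

In a browser, the device list would come from `navigator.mediaDevices.enumerateDevices()` and the result would be passed to `navigator.mediaDevices.getUserMedia(...)`. Device labels are typically empty until the user has granted camera or microphone permission, so matching by label only works after that.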

Built With
- gemini-2.5-flash-native-audio-preview-12-2025
- gemini-3-flash-preview-api
- gemini-3-pro-preview
- google-ai-studio
- google-gemini-live-api
- html
- html5-canvas-api
- indexeddb
- javascript
- mediadevices-api
- mediarecorder-api
- react
- tailwind-css
- typescript
- web-audio-api
- websocket

