Lazarus - AI Medical Guardian
Inspiration
Emergency 911 operators are the first line of defense, often managing chaos with just their ears. But they are human, and under extreme pressure, subtle diagnostic cues can slip through the cracks, such as the gasping sound of agonal breathing in cardiac arrest or the slight one-sided facial droop at the onset of a stroke.
We were inspired to build Lazarus—named after the biblical figure raised from the dead—as an always-on "Guardian Angel." It’s a silent partner that watches and listens alongside the human operator, tapping them on the shoulder only when it detects a life-threatening emergency that requires immediate intervention.
What it does
Lazarus is a real-time, multimodal intelligence engine for emergency response.
- Live Monitoring: It ingests continuous streams of Audio (PCM) and Video (Frames) from the caller's device.
- Multimodal Diagnosis: Using Gemini 3 Pro, it correlates visual cues (e.g., cyanosis/blue skin) with audio cues (e.g., confused speech) to diagnose conditions like Stroke, Cardiac Arrest, and Shock.
- Immediate Alerting: When a threat is detected, it triggers a "Red Alert" on the operator's HUD within 500ms, displaying the diagnosis and confidence level.
- CPR Assistance: The "Rhythm of Life" protocol listens to chest compressions during CPR and provides real-time feedback (PUSH FASTER / PUSH SLOWER) to ensure effective resuscitation.
- Evidence Review: The "Phantom Replay" system captures a rolling 5-second buffer, allowing operators to instantly replay the exact moment the AI flagged an issue.
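The rolling 5-second buffer behind "Phantom Replay" can be pictured as a time-windowed queue. The Python sketch below only illustrates that data structure; it is not the project's browser code, and the class name `RollingBuffer` and its methods are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class RollingBuffer:
    """Keeps only the frames captured within the last `window_s` seconds."""

    def __init__(self, window_s: float = 5.0) -> None:
        self.window_s = window_s
        self._frames: Deque[Tuple[float, bytes]] = deque()

    def push(self, timestamp_s: float, frame: bytes) -> None:
        """Append a frame, then evict everything older than the window."""
        self._frames.append((timestamp_s, frame))
        cutoff = timestamp_s - self.window_s
        while self._frames and self._frames[0][0] < cutoff:
            self._frames.popleft()

    def snapshot(self) -> List[Tuple[float, bytes]]:
        """Return a copy of the buffered frames, oldest first, for replay."""
        return list(self._frames)
```

Because old frames are evicted on every push, memory use stays bounded by the window length rather than the call length, and `snapshot()` can be taken at the moment an alert fires.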
How we built it
We architected Lazarus on the "Iron Triangle" of Speed, Synergy, and Safety.
- The AI Core (The Brain): We utilized Google's Gemini 3 Pro (Multimodal Live API). Unlike older pipelines that required transcribing audio to text first, Gemini processes raw audio waveforms and video pixels simultaneously. This allows it to "hear" the texture of a breath while "seeing" the patient's face.
- The Frontend (The Eyes): Built with React 19 and Tailwind CSS. We utilized the MediaStream API for low-level access to webcam and microphone data. The interface is designed as a "Terminator-style" HUD using SVG overlays and CSS animations to keep the operator focused.
- The Backend (The Nervous System): A Python FastAPI server acts as the high-speed bridge. It manages WebSocket connections, buffering binary media chunks and streaming them asynchronously to the Gemini session.
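To make "binary media chunks" concrete, here is a minimal sketch of one way a chunk could be framed for a binary WebSocket message. The 9-byte header layout, the `KIND_*` codes, and the function names are assumptions for illustration, not the project's actual wire format.

```python
import struct
from typing import Tuple

# Hypothetical header: 1-byte media kind + 8-byte capture time in ms,
# in network byte order with no padding (9 bytes total).
HEADER = struct.Struct("!BQ")
KIND_AUDIO = 0
KIND_VIDEO = 1


def pack_chunk(kind: int, capture_ms: int, payload: bytes) -> bytes:
    """Prefix the raw payload with the fixed-size header."""
    return HEADER.pack(kind, capture_ms) + payload


def unpack_chunk(message: bytes) -> Tuple[int, int, bytes]:
    """Split a framed message back into (kind, capture_ms, payload)."""
    kind, capture_ms = HEADER.unpack_from(message)
    return kind, capture_ms, message[HEADER.size:]
```

As a size check under an assumed audio format: 100 ms of 16 kHz, 16-bit mono PCM is 3200 bytes, which this framing sends as 3209 bytes. Base64-encoding the same payload yields 4268 characters before any JSON wrapping, roughly a third more data on the wire.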
Challenges we ran into
- The Latency Budget: In medical emergencies, seconds equal neurons. Our goal was a total system latency under 500 ms. $$ L_{\text{total}} = T_{\text{capture}} + T_{\text{network}} + T_{\text{inference}} + T_{\text{render}} \le 500\ \text{ms} $$ We had to move away from base64 encoding for audio and implement binary WebSocket transmission to meet this target.
- Prompt Engineering for strict JSON: Getting a large language model to strictly output machine-readable JSON without conversational filler ("Here is the diagnosis...") was difficult. We iterated heavily on the System Instructions to enforce a strict schema.
- Audio-Visual Synchronization: Ensuring that the "slurred speech" audio chunk aligned perfectly with the "facial droop" video frame required careful timestamp management in our buffering logic.
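Two of the challenges above lend themselves to a short sketch: rejecting any model reply that is not strict JSON, and pairing an audio chunk with the video frame closest to it in time. The schema (`diagnosis`, `confidence`), the 50 ms tolerance, and both function names are assumptions for illustration; the project's real schema and buffering logic are not shown in this write-up.

```python
import json
from bisect import bisect_left
from typing import List, Optional

# Hypothetical schema: exactly these two keys, nothing else.
EXPECTED_KEYS = {"diagnosis", "confidence"}


def parse_alert(raw: str) -> Optional[dict]:
    """Return the parsed alert, or None if `raw` is not strict, valid JSON."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # conversational filler or truncated output
    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        return None
    diagnosis, confidence = data["diagnosis"], data["confidence"]
    if not isinstance(diagnosis, str):
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return data


def nearest_frame(frame_ts_ms: List[int], audio_ts_ms: int,
                  tolerance_ms: int = 50) -> Optional[int]:
    """Index of the frame closest to the audio timestamp, or None if the
    closest one is further away than `tolerance_ms`.
    `frame_ts_ms` must be sorted in ascending order."""
    if not frame_ts_ms:
        return None
    i = bisect_left(frame_ts_ms, audio_ts_ms)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(frame_ts_ms)]
    best = min(candidates, key=lambda j: abs(frame_ts_ms[j] - audio_ts_ms))
    if abs(frame_ts_ms[best] - audio_ts_ms) > tolerance_ms:
        return None
    return best
```

Validating on the receiving side complements the system instructions: a reply such as `Here is the diagnosis: {...}` fails `json.loads` and is dropped instead of reaching the HUD.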
Accomplishments that we're proud of
- The "Rhythm of Life" Algorithm: We successfully implemented a feedback loop in which the AI listens to the sound of chest compressions and estimates the rate in BPM. If the rate drops too low ($< 100$ BPM), the HUD visually pulses red to prompt faster compressions.
- Phantom Replay: Building a rolling in-memory video buffer in the browser without memory leaks was a technical win that adds immense value to the user experience.
- Zero-UI Interaction: The system requires no typing or clicking from the operator; it is purely passive until it needs to be active.
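The feedback step of "Rhythm of Life" can be sketched as follows, assuming compression timestamps have already been detected from the audio (that detection is done by the model and is not shown here). The 100 BPM lower bound comes from this write-up; the 120 BPM upper bound is an assumption based on the commonly cited 100 to 120 compressions-per-minute guideline, and the `WAITING` and `GOOD RATE` labels are hypothetical.

```python
from typing import List, Optional

MIN_BPM = 100.0  # from the write-up: below this, prompt to speed up
MAX_BPM = 120.0  # assumption: upper end of the commonly cited target range


def estimate_bpm(compression_times_s: List[float]) -> Optional[float]:
    """Average rate over the given compression timestamps (in seconds)."""
    if len(compression_times_s) < 2:
        return None
    span = compression_times_s[-1] - compression_times_s[0]
    if span <= 0:
        return None
    intervals = len(compression_times_s) - 1
    return 60.0 * intervals / span


def feedback(bpm: Optional[float]) -> str:
    """Map an estimated rate to the prompt shown on the HUD."""
    if bpm is None:
        return "WAITING"  # hypothetical label: not enough data yet
    if bpm < MIN_BPM:
        return "PUSH FASTER"
    if bpm > MAX_BPM:
        return "PUSH SLOWER"
    return "GOOD RATE"  # hypothetical label
```

For example, four compressions spaced 0.75 s apart give three intervals over 2.25 s, which is 80 BPM and therefore `PUSH FASTER`.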
What we learned
- Multimodal is Essential: Text-only AI cannot hear the difference between a wheeze and a stridor, nor see the difference between a shadow and cyanosis. Multimodal models are a requirement, not a luxury, in MedTech.
- Latency is User Experience: In a high-stress environment, even a 1-second delay in feedback can break the operator's flow. Optimizing the network layer was just as important as the AI model selection.
What's next for Lazarus
- CAD Integration: Connecting Lazarus directly into existing Computer Aided Dispatch systems used by 911 centers.
- Expanded Diagnostics: Adding detection for Seizures (movement patterns) and Choking (specific audio signatures).
- Automated PCRs: Using the session context to automatically generate Patient Care Reports (PCRs) after the call ends, saving paramedics hours of paperwork.
Built With
- googlegemini
- python
- react
- tailwind
- typescript
- webaudioapi
- websockets