🤰 MaaSwara
A Mother’s Voice — Multimodal AI Triage Engine for Low-Resource Maternal Healthcare
🚀 From Idea to Impact
The Problem — The Delay That Kills
Every two minutes, a woman dies during pregnancy or childbirth.
- More than 80% of these deaths are preventable
- 94% occur in low-resource regions
- More than 50% are attributed to the “Type 1 Delay”: failure to recognize danger signs
A mother experiences:
- Severe headache
- Blurred vision
- Swelling
She assumes it’s normal.
It is preeclampsia — and without intervention, it can kill her within hours.
Healthcare systems assume:
- Literacy
- Awareness
- Internet access
She has none.
The Solution — MaaSwara
Voice → AI → Clinical Decision → Real-Time Intervention
MaaSwara is a voice-first, multilingual AI triage system that:
- Listens in 100+ languages
- Understands symptoms clinically
- Classifies severity
- Acts instantly
No typing. No literacy. No delay.
🧠 What It Does
Intent → Action Mapping
| Intent | Example Input | System Output |
|---|---|---|
| Normal symptom | “Thoda dard hai” | GREEN → reassurance |
| Critical symptom | “Severe headache + blurred vision” | RED → clinic alert |
| Fetal concern | “Baby not moving” | YELLOW → urgent follow-up |
| Emergency | “Heavy bleeding” | RED → override → dispatch |
🌐 Accessibility Layers
| Mode | Environment | Description |
|---|---|---|
| 🎙️ Voice | Low literacy | Real-time speech interaction |
| 💬 Web Chat | Standard users | UI-based interaction |
| 📲 Telegram | 2G networks | Ultra low-bandwidth input |
All inputs → Single Unified Triage Engine
🏗️ Architecture
System Overview
┌────────────────────────────────────────────┐
│             Client Interfaces              │
│ 🎙️ Voice UI  💬 Web Chat  📲 Telegram Bot │
│ (Web Audio)   (React UI)     (2G Input)    │
└─────────────────┬──────────────────────────┘
                  │ HTTP / WebSocket / Webhook
┌─────────────────▼──────────────────────────┐
│            Next.js Edge Backend            │
│  /api/chat         /api/live-token         │
│  /api/alerts       /api/telegram/webhook   │
└───────────────┬───────────────┬────────────┘
                │               │
        ┌───────▼───────┐ ┌─────▼────────────┐
        │  Gemini-2.5   │ │  Deterministic   │
        │   Flash AI    │ │  Safety Engine   │
        │   (Primary)   │ │   (WHO Rules)    │
        └───────┬───────┘ └─────┬────────────┘
                │               │
                └───────┬───────┘
                        ▼
             ┌────────────────────────┐
             │   Severity Resolver    │
             │    (Override Logic)    │
             └──────────┬─────────────┘
                        │
             ┌──────────▼─────────────┐
             │   Supabase Database    │
             │   (alerts + routing)   │
             └──────────┬─────────────┘
                        │
             ┌──────────▼─────────────┐
             │  Geolocation Routing   │
             │  (Haversine Distance)  │
             └──────────┬─────────────┘
                        │
             ┌──────────▼─────────────┐
             │    Realtime Alerts     │
             │      (WebSockets)      │
             └──────────┬─────────────┘
                        │
             ┌──────────▼─────────────┐
             │   Provider Dashboard   │
             │   (Clinic Interface)   │
             └────────────────────────┘
⚙️ Processing Pipeline
User Input
│
├──► AI Engine (Gemini-2.5-Flash)
│      │
│      └──► summary_en (proxy output)
│
├──► Deterministic Safety Engine
│      └──► regex match (WHO danger signs)
│
└──► Severity Resolver
       │
       └──► RED → Alert Dispatch
🔁 Agent-Level Flow
User Input
│
├──► triage_engine   → classifies severity (LLM)
├──► safety_engine   → validates danger signs (regex)
└──► resolver_engine → computes final severity
If RED:
  ├──► routing_engine → finds nearest clinic
  └──► alert_engine   → pushes realtime alert
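The flow above can be sketched as a single orchestration function. This is a minimal illustration, not the project's code: the `Engines` interface, `GeoPoint`, `handleMessage`, and every function signature here are assumptions, and each engine is injected so the flow can be run with stubs.

```typescript
type Severity = "GREEN" | "YELLOW" | "RED";
interface GeoPoint { lat: number; lon: number; }

// Hypothetical engine signatures; each one is injected so the flow can be stubbed.
interface Engines {
  triage: (input: string) => Promise<{ severity: Severity; summaryEn: string }>;
  safety: (raw: string, summaryEn: string) => boolean; // true if a danger sign matched
  route: (from: GeoPoint) => Promise<string>;          // resolves to a clinic id
  alert: (clinicId: string, severity: Severity) => Promise<void>;
}

async function handleMessage(input: string, from: GeoPoint, e: Engines): Promise<Severity> {
  const { severity: llm, summaryEn } = await e.triage(input); // triage_engine
  const danger = e.safety(input, summaryEn);                  // safety_engine
  const finalSeverity: Severity = danger ? "RED" : llm;       // resolver_engine
  if (finalSeverity === "RED") {
    const clinicId = await e.route(from);                     // routing_engine
    await e.alert(clinicId, finalSeverity);                   // alert_engine
  }
  return finalSeverity;
}
```

Routing and alerting run only on the RED path, so GREEN and YELLOW cases never touch the clinic dashboard in this sketch.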
🧠 Core Breakthrough — AI Safety System
Problem: LLM Hallucination = Risk
AI can:
- Miss critical symptoms
- Misclassify severity
In healthcare → unacceptable.
Solution: Dual-Layer Decision Engine
| Layer | Role |
|---|---|
| AI (LLM) | Understanding + translation |
| Deterministic Engine | Safety enforcement |
📐 Mathematical Model
Let:
- \( S \) = raw input
- \( E \) = English proxy summary
- \( \mathcal{T} = \{T_1, T_2, \dots, T_{11}\} \) = danger signs
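Trigger condition (a rule fires if any danger sign matches the raw input or its English summary):
$$ \exists\, T_i \in \mathcal{T} : \text{match}(T_i, S) \lor \text{match}(T_i, E) $$
Final decision, with severities ordered GREEN < YELLOW < RED and the deterministic override set to RED whenever the trigger condition holds:
$$ \text{Final Severity} = \max(\text{LLM Output}, \text{Deterministic Override}) $$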
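A minimal sketch of the trigger condition and the max rule, assuming the severities are ordered GREEN < YELLOW < RED. The four patterns in `DANGER_SIGNS` are an illustrative subset chosen for this example (the project's actual 11 rules are not listed here), and `RANK`, `deterministicOverride`, and `resolveSeverity` are hypothetical names.

```typescript
type Severity = "GREEN" | "YELLOW" | "RED";

// Ordering used by max(): GREEN < YELLOW < RED.
const RANK: Record<Severity, number> = { GREEN: 0, YELLOW: 1, RED: 2 };

// Illustrative English danger-sign patterns (assumed, not the project's rule set).
const DANGER_SIGNS: RegExp[] = [
  /\b(heavy|severe)\s+bleeding\b/i,
  /\bsevere\s+headache\b/i,
  /\bblurr(ed|y)\s+vision\b/i,
  /\b(convulsions?|seizures?)\b/i,
];

// True if any T_i matches the raw input S or the English summary E.
function deterministicOverride(raw: string, summaryEn: string): boolean {
  return DANGER_SIGNS.some((t) => t.test(raw) || t.test(summaryEn));
}

// Final severity = max(LLM output, deterministic override).
function resolveSeverity(llm: Severity, raw: string, summaryEn: string): Severity {
  const override: Severity = deterministicOverride(raw, summaryEn) ? "RED" : "GREEN";
  return RANK[override] > RANK[llm] ? override : llm;
}
```

Because the override can only raise the severity, an LLM answer of RED is never downgraded by the rules, and a missed danger sign in the LLM answer is still escalated if a pattern matches.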
🌍 Proxy Language System
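Instead of writing rules for every language:
- AI generates summary_en, an English summary of the user's message
- Safety rules are written once, in English, and run against that summary
Result:
- 100+ language support
- No per-language rule sets to maintain
- The deterministic safety check applies to every message, whatever the input language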
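Since the English rules depend on `summary_en`, the model's reply has to be validated before it is trusted. The sketch below assumes the model is asked to return JSON with a `severity` field next to `summary_en`; that response shape, the `parseTriage` helper, and the choice of YELLOW as the fallback are assumptions made for illustration.

```typescript
type Severity = "GREEN" | "YELLOW" | "RED";

interface TriageResult {
  severity: Severity;
  summary_en: string; // English proxy summary scanned by the safety rules
}

function isSeverity(x: unknown): x is Severity {
  return x === "GREEN" || x === "YELLOW" || x === "RED";
}

// Parse the model's JSON reply. On malformed output, fall back to YELLOW with
// an empty summary (an assumed policy) so the case is never silently GREEN.
function parseTriage(modelText: string): TriageResult {
  const fallback: TriageResult = { severity: "YELLOW", summary_en: "" };
  try {
    const data: unknown = JSON.parse(modelText);
    if (typeof data !== "object" || data === null) return fallback;
    const { severity, summary_en } = data as Record<string, unknown>;
    if (!isSeverity(severity) || typeof summary_en !== "string") return fallback;
    return { severity, summary_en };
  } catch {
    return fallback;
  }
}
```

Even when the summary is missing, the raw input can still be scanned by the deterministic rules, so a parse failure degrades the check rather than disabling it.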
🔄 System Workflows
🎙️ Voice Workflow
Audio Capture (PCM16)
│
├──► WebSocket Stream
│
├──► Gemini Live Processing
│
├──► Safety Override Check
│
└──► Severity Output → Alert if RED
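The capture step has to turn Web Audio's floating-point samples into PCM16 before streaming. A minimal sketch of that conversion, assuming little-endian output; `floatToPcm16` is a hypothetical helper name and the byte order the streaming endpoint expects is not stated in this write-up.

```typescript
// Convert Web Audio float samples in [-1, 1] to little-endian 16-bit PCM.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2); // 2 bytes per sample
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to avoid overflow
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;   // map to the int16 range
    view.setInt16(i * 2, Math.round(scaled), true);   // true = little-endian
  }
  return buffer;
}
```

In an AudioWorklet setup, a function like this would typically run on each block of samples before the bytes are sent over the WebSocket.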
📲 Telegram Workflow
User Message
│
├──► Webhook (Next.js API)
│
├──► AI Processing
│
├──► Markdown Stripped
│
└──► Response Delivered
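The Markdown-stripping step can be sketched as a small text filter plus a retry. This is an assumption about how the step might work, not the project's code: `stripMarkdown`, `sendWithFallback`, and the injected `Send` function are hypothetical, and the regexes cover only common markers.

```typescript
// Remove common Markdown markers so the reply is safe to send as plain text.
function stripMarkdown(text: string): string {
  return text
    .replace(/\[([^\]]+)\]\([^)]+\)/g, "$1") // [label](url) becomes label
    .replace(/^#{1,6}\s+/gm, "")             // heading markers
    .replace(/[*`]/g, "")                    // bold/italic asterisks, code ticks
    .replace(/(^|\s)_+([^_\n]+?)_+(?=\s|[.,!?]|$)/g, "$1$2") // _italic_ and __bold__
    .trim();
}

// Hypothetical sender: wraps whatever call delivers the message to Telegram.
type Send = (text: string, parseMode?: "Markdown") => Promise<void>;

// Try formatted delivery first; if the API rejects it, resend as plain text.
async function sendWithFallback(send: Send, text: string): Promise<"markdown" | "plain"> {
  try {
    await send(text, "Markdown");
    return "markdown";
  } catch {
    await send(stripMarkdown(text));
    return "plain";
  }
}
```

Underscores inside identifiers such as summary_en are left alone, because the underscore rule only fires when the marker follows whitespace or the start of the text.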
🚨 Alert Dispatch Workflow
RED Detected
│
├──► Capture GPS
├──► Haversine Distance Calculation
├──► Assign nearest clinic
├──► Store in Supabase
└──► Push via WebSocket → Dashboard
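The distance and assignment steps can be sketched with the standard Haversine formula. The `Clinic` shape, the function names, and the mean Earth radius of 6371 km are assumptions for this example; how clinics are actually stored in Supabase is not shown here.

```typescript
interface Clinic { id: string; lat: number; lon: number; }

const EARTH_RADIUS_KM = 6371; // mean Earth radius
const toRad = (deg: number): number => (deg * Math.PI) / 180;

// Great-circle distance between two points given in decimal degrees.
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  // Clamp to 1 to guard against floating-point drift before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(a)));
}

// Linear scan for the closest clinic; returns undefined for an empty list.
function nearestClinic(lat: number, lon: number, clinics: Clinic[]): Clinic | undefined {
  let best: Clinic | undefined;
  let bestKm = Infinity;
  for (const c of clinics) {
    const km = haversineKm(lat, lon, c.lat, c.lon);
    if (km < bestKm) {
      best = c;
      bestKm = km;
    }
  }
  return best;
}
```

Haversine gives straight-line distance over the Earth's surface, so the nearest clinic by this measure is not necessarily the fastest to reach by road.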
🛠️ Tech Stack
| Layer | Technology |
|---|---|
| Frontend | React + Next.js |
| AI Engine | Gemini 2.5 Flash |
| Voice | Web Audio API + AudioWorklet |
| Backend | Next.js Edge APIs |
| Database | Supabase PostgreSQL |
| Realtime | WebSockets |
| Routing | Haversine Algorithm |
| Hosting | Vercel |
⚔️ Challenges
- LLM hallucination risk → solved via deterministic override
- Multilingual scaling → solved via proxy English system
- Browser restrictions on Web Audio → audio started manually on user interaction
- Telegram Markdown delivery failures → fall back to raw text
- Real-time reliability → WebSocket pipeline
🏆 Accomplishments
- Built a real-time AI healthcare system
- Achieved 100+ language support
- Designed zero-retention voice pipeline
- Created end-to-end emergency workflow
- Contained LLM errors in a critical setting with a deterministic safety override
- Delivered a fully working system
📚 What We Learned
- AI must be controlled, not trusted blindly
- Real-world impact requires system-level thinking
- Accessibility > complexity
- Constraints drive innovation
- Voice is the most powerful interface in low-resource systems
🚀 What's Next
- EHR / DHIS2 integration
- Biometric device integration
- WhatsApp & IVR support
- Automated ambulance dispatch
- Government-level deployment
🌍 Impact
MaaSwara transforms maternal healthcare from:
❌ Reactive
❌ Delayed
❌ Literacy-dependent
To:
✅ Real-time
✅ Voice-first
✅ Accessible
✅ Life-saving
✨ Final Statement
"Listen to the mother. Override the hallucination. Save the life."
Built With
- gemini-2.5-flash
- next.js
- postgresql
- react
- supabase
- typescript
- vercel
- web-audio-api
- websockets

