Inspiration
- 25M+ patients in the U.S. are Limited English Proficient (LEP), leading to 3× higher rates of adverse medical events
- Existing tools translate words, not meaning, missing cultural context in symptom descriptions
- Real-world examples inspired us:
  - “A cat sitting on my chest” → angina (Vietnamese metaphor)
  - “Fire in my stomach” → peptic ulcer disease (Hindi expression)
- We saw a gap: language translation ≠ understanding
- We asked: What if AI could translate cultural expression into clinical insight?
What it does
- A multi-agent cultural intelligence platform for healthcare
- Converts patient speech → culturally-aware clinical insights
Key capabilities:
- Speech-first symptom input in native languages
- Cultural metaphor → medical meaning mapping (example entry sketched after this list)
- ICD-10 codes + recommended screenings for doctors
- Food photo analysis → culturally relevant diet plans
- Voice-based care instructions for families in their language
- Care circle sharing across family members
Two views:
- Patient view: fully localized, voice-first experience
- Doctor view: clean clinical dashboard with actionable insights
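To make the metaphor → meaning mapping concrete, here is an illustrative knowledge-base entry built from the angina example above. The field names and screening list are a sketch, not our exact schema; I20.9 is the ICD-10 code for unspecified angina pectoris.

```javascript
// Illustrative cultural knowledge-base entry (schema is a sketch, not the exact one we ship).
// The expression and mapping come from the Vietnamese angina metaphor above.
const exampleEntry = {
  expression: "a cat sitting on my chest",
  language: "vi",
  literalSense: "heavy pressure on the chest",
  clinicalMeaning: "chest pressure / possible angina",
  icd10: ["I20.9"],                              // Angina pectoris, unspecified
  recommendedScreenings: ["ECG", "troponin panel"],
  status: "validated",                           // entries are human-reviewed before use
};
```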
How we built it
1) Frontend:
- React + Vite + Tailwind
- Multilingual UI (10 languages, native scripts)
- SpeechRecognition API for on-device transcription (sketched after this list)
2) Backend
- Node.js + Express
- MongoDB Atlas for sessions, patients, doctors, feedback
3) AI Stack
- Gemma 4 31B → cultural symptom reasoning (RAG)
- Gemini 2.5 Flash → translation + vision (food analysis)
- Claude Sonnet → family assistant + mental wellness
4) ElevenLabs → multilingual voice synthesis
5) Agent Architecture (Fetch.ai Agentverse)
- Cultural NLP Agent
- Dietary Agent
- Voice Agent
- Orchestrator Agent (routes all queries; routing sketched after this list)
6) Other integrations
- Cloudinary → food image pipeline
- Auth0 → secure doctor authentication
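Two quick sketches of the plumbing above. First, the browser speech capture, simplified from the real React component; `hi-IN` is just an example locale, set from the patient's language choice:

```javascript
// Minimal browser speech capture sketch (simplified from our React component).
// SpeechRecognition is prefixed as webkitSpeechRecognition in Chromium browsers.
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;

const recognizer = new SpeechRecognition();
recognizer.lang = "hi-IN";          // example: Hindi
recognizer.interimResults = false;  // only final transcripts
recognizer.continuous = false;      // stop after one utterance

recognizer.onresult = (event) => {
  const transcript = event.results[0][0].transcript;
  // Hand the native-language transcript to the backend for cultural analysis.
  console.log("Heard:", transcript);
};

recognizer.onerror = (event) => console.error("Speech error:", event.error);
recognizer.start();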
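Second, a simplified view of how the Orchestrator Agent routes queries to the specialist agents. The handlers here are illustrative stubs; the deployed agents run on Fetch.ai Agentverse and the real handlers call the models listed above.

```javascript
// Simplified orchestrator routing sketch (stubs only; illustrative, not the deployed code).
const handleCulturalNlp = (q) => ({ agent: "cultural-nlp", text: q.text });  // Gemma + RAG
const handleDietary     = (q) => ({ agent: "dietary", image: q.imageUrl });  // Gemini vision
const handleVoice       = (q) => ({ agent: "voice", text: q.text });         // ElevenLabs

// The orchestrator inspects each query and forwards it to one specialist agent.
function routeQuery(query) {
  if (query.imageUrl) return handleDietary(query);   // food photo → diet analysis
  if (query.wantsAudio) return handleVoice(query);   // spoken care instructions
  return handleCulturalNlp(query);                   // default: symptom text
}

console.log(routeQuery({ text: "fire in my stomach" })); // → cultural-nlp agent
```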
Challenges we ran into
1) Cultural accuracy
- AI alone cannot infer clinical meaning → required building a validated cultural knowledge base
2) Model routing
- One model couldn’t handle everything → had to split tasks across Gemma, Gemini, Claude
3) Multilingual UX
- Supporting native scripts (Hindi, Arabic, etc.) correctly across the UI (see the sketch after this list)
4) Real-time synchronization
- Keeping patient + doctor views consistent across sessions
5) Balancing complexity vs. usability
- Avoiding overwhelming doctors with technical outputs
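On the native-script challenge, one concrete fix was setting the document language and text direction per locale. A minimal sketch; the locale table here is illustrative:

```javascript
// Minimal sketch of per-language script handling (locale list is illustrative).
// Arabic is right-to-left; Hindi uses Devanagari but stays left-to-right.
const LOCALES = {
  en: { dir: "ltr" },
  hi: { dir: "ltr" }, // Devanagari script
  ar: { dir: "rtl" }, // Arabic script
  vi: { dir: "ltr" },
};

function applyLocale(code) {
  document.documentElement.lang = code;                       // helps font selection and shaping
  document.documentElement.dir = LOCALES[code]?.dir ?? "ltr"; // flips layout for RTL scripts
}

applyLocale("ar"); // e.g. switch the whole UI to Arabic, right-to-left
```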
Accomplishments that we're proud of
- Built a full-stack, production-style system, not just a prototype
- Successfully mapped cultural expressions → clinical diagnoses
- Designed a multi-agent architecture that is reusable beyond this app
- Delivered end-to-end multilingual experience (text + voice)
- Created a solution with real-world healthcare impact potential
What we learned
- Cultural intelligence is as important as language translation
- Model specialization matters — right model for the right task
- Speech-first design is critical for accessibility
- Good UX ≠ more features — clarity matters more than complexity
- AI systems need human-grounded data (knowledge bases) to be reliable
What's next for Voice-Of-Home
- Native mobile app with true on-device speech processing
- Expand cultural knowledge base (100+ conditions, 50+ languages)
- Real-time sync using MongoDB Change Streams (sketched below)
- Deploy vector search for better cultural expression matching (sketched below)
- Pilot with hospitals + pursue HIPAA compliance
- Open the agent layer as a public healthcare infrastructure API
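For the planned real-time sync, a minimal sketch of the Change Streams side using the Node.js MongoDB driver. Database, collection, and field names are hypothetical, not our actual schema:

```javascript
// Hypothetical sketch of the planned Change Streams sync (Node.js MongoDB driver).
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_URI);
await client.connect();
const sessions = client.db("voiceofhome").collection("sessions");

// Watch for updates to patient sessions and push them to the doctor view.
const stream = sessions.watch(
  [{ $match: { operationType: "update" } }],
  { fullDocument: "updateLookup" } // include the full updated document
);

stream.on("change", (change) => {
  // e.g. broadcast over WebSocket so the doctor dashboard updates live
  console.log("Session updated:", change.fullDocument?._id);
});
```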
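And a sketch of the planned vector-search lookup using an Atlas $vectorSearch aggregation stage. The index name, field paths, and the embed() helper are hypothetical stand-ins; `client` is reused from the Change Streams sketch above:

```javascript
// Hypothetical sketch of Atlas Vector Search over stored cultural expressions.
const expressions = client.db("voiceofhome").collection("cultural_expressions");

// Stand-in for a real embedding call; a production version would call an embeddings API.
const embed = async (text) => Array(768).fill(0); // 768 dims is just an example

const matches = await expressions.aggregate([
  {
    $vectorSearch: {
      index: "expression_embeddings",  // hypothetical Atlas Vector Search index
      path: "embedding",               // field storing each expression's vector
      queryVector: await embed("fire in my stomach"),
      numCandidates: 100,              // ANN candidate pool before ranking
      limit: 5,                        // top matches to return
    },
  },
  { $project: { _id: 0, expression: 1, clinicalMeaning: 1, icd10: 1 } },
]).toArray();

console.log(matches); // nearest known expressions with their clinical mappings
```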
Built With
- asi
- browser
- claude
- cloudinary
- css
- elevenlabs
- express.js
- fetch.ai
- gemini
- gemma
- github
- html
- javascript
- mongodb
- node.js
- python
- rag
- react
- rest
- speechrecognition
- tailwind