Inspiration

466 million people worldwide have disabling hearing loss. For deaf and hard-of-hearing patients, every doctor's visit is a communication crisis — medical jargon flies past, critical instructions get lost, and life-altering diagnoses are delivered in language patients can't fully access. Interpreters are scarce, expensive, and often unavailable for urgent visits. We built HealthBridge because no one should leave a doctor's office confused about their own health.

What it does

HealthBridge is a real-time communication bridge between deaf/hard-of-hearing patients and healthcare providers. It works in both directions simultaneously:

  • Patient → Doctor: ASL signs are captured via webcam, recognized by Gemini 3 Flash, and displayed as text on the doctor's dashboard
  • Doctor → Patient: Spoken medical terminology is transcribed and instantly simplified into plain language the patient can understand (sketched in code below)
  • Medical Object Scanner: Patients can photograph pills, devices, or wounds for AI-powered identification and plain-language explanation
  • Patient History Query: Doctors ask natural language questions about patient records and get synthesized answers from the full clinical history

HealthBridge is a communication tool — it translates, it doesn't diagnose.
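To make the doctor → patient direction concrete, here's a minimal sketch of the jargon-simplification call, assuming the @google/genai SDK. The prompt wording, model ID, and config values are illustrative rather than our exact production code:

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Illustrative sketch: rewrite transcribed doctor speech in plain language.
// Jargon detection runs at LOW thinking in our setup for responsiveness.
export async function simplifyForPatient(transcript: string): Promise<string> {
  const response = await ai.models.generateContent({
    model: "gemini-3-pro", // placeholder model ID
    contents:
      "Rewrite the following clinician speech in plain language a patient " +
      `can understand. Do not add medical advice.\n\n${transcript}`,
    config: { thinkingConfig: { thinkingLevel: "low" } },
  });
  return response.text ?? transcript; // fall back to the raw transcript
}
```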

How we built it

  • Frontend: Next.js 14 (App Router) + TypeScript + Tailwind CSS with a glassmorphism design language
  • AI Engine: Multi-model Gemini 3 architecture — Pro for reasoning-heavy tasks (patient history at HIGH thinking, object triage at MEDIUM, jargon detection at LOW), Flash for real-time ASL recognition (LOW thinking, HIGH media resolution), and Gemini 2.5 Flash Native Audio for bidirectional streaming via the Live API
  • Real-time Video: LiveKit (WebRTC) for low-latency video sessions between patient and doctor views
  • Gemini 3 Features: Per-task Thinking Levels (HIGH, MEDIUM, and LOW across four pipelines), Media Resolution control for vision pipelines, and Thought Signatures for stateful reasoning across multi-step function calls (see the configuration sketch after this list)
  • Deployment: Vercel with automatic CI/CD from GitHub
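The per-task tuning is easiest to see as one configuration map. A simplified sketch (pipeline names are illustrative; in the real app each pipeline lives behind its own API route):

```typescript
// Simplified sketch of our per-pipeline Gemini settings (illustrative names).
type Pipeline =
  | "patientHistory"
  | "objectTriage"
  | "jargonDetection"
  | "aslRecognition";

const GEMINI_CONFIG: Record<
  Pipeline,
  {
    model: string;
    thinkingLevel: "high" | "medium" | "low";
    mediaResolution?: "high" | "medium";
  }
> = {
  patientHistory:  { model: "gemini-3-pro",   thinkingLevel: "high" },   // deep reasoning
  objectTriage:    { model: "gemini-3-pro",   thinkingLevel: "medium", mediaResolution: "medium" },
  jargonDetection: { model: "gemini-3-pro",   thinkingLevel: "low" },    // low latency
  aslRecognition:  { model: "gemini-3-flash", thinkingLevel: "low",    mediaResolution: "high" },
};
```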

Challenges we ran into

The biggest challenge was balancing accuracy with speed across different parts of the pipeline. Patient history synthesis demands deep reasoning (HIGH thinking) while ASL recognition needs sub-second latency (LOW thinking). Medical object triage sits in between — accurate enough to identify medications correctly, fast enough to feel responsive (MEDIUM thinking). Gemini 3's Thinking Levels let us optimize each pipeline independently rather than choosing one tradeoff for the whole application.

We also discovered that the Gemini Live API (BidiGenerateContent) isn't available for Gemini 3 models, which forced a deliberate architectural split: Gemini 2.5 Flash Native Audio handles that streaming pipeline, while every reasoning task stays on Gemini 3.
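For reference, a minimal sketch of that audio session, assuming the Live API surface of the @google/genai SDK; the model ID is a placeholder for the current native-audio preview:

```typescript
import { GoogleGenAI, Modality } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Sketch: open the bidirectional audio session on Gemini 2.5 Flash Native
// Audio while reasoning stays on Gemini 3. Model ID is a placeholder.
export async function startAudioBridge(onText: (text: string) => void) {
  const session = await ai.live.connect({
    model: "gemini-2.5-flash-native-audio-preview",
    config: { responseModalities: [Modality.AUDIO] },
    callbacks: {
      onmessage: (msg) => {
        // Forward transcriptions/audio chunks to the patient and doctor views.
        const text = msg.serverContent?.modelTurn?.parts?.[0]?.text;
        if (text) onText(text);
      },
      onerror: (e) => console.error("Live session error:", e),
      onclose: () => console.log("Live session closed"),
    },
  });

  // Microphone audio is then streamed in as 16 kHz PCM chunks, e.g.:
  // session.sendRealtimeInput({ audio: { data: base64Pcm, mimeType: "audio/pcm;rate=16000" } });
  return session;
}
```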

Accomplishments that we're proud of

  • True bidirectional communication — patient signs, doctor sees text; doctor speaks, patient sees simplified text
  • An intentional multi-model architecture with four thinking configurations, each matched to its task's latency and accuracy requirements
  • A design that doesn't look like a hackathon project — glassmorphism UI with motion and depth that feels like a real product
  • Accessibility-first design for the deaf/HoH community, with visual indicators replacing audio cues throughout

What we learned

Gemini 3's Thinking Levels are a game-changer for applications with mixed latency requirements. We use four different configurations across the app — HIGH for patient history queries that need deep reasoning, MEDIUM for medical object triage that balances accuracy and speed, and LOW for both jargon detection and ASL recognition where real-time responsiveness is critical. Being able to dial reasoning up or down per-request means you can have both accuracy AND speed in the same application.

Media Resolution control was equally important — ASL recognition requires HIGH resolution to distinguish subtle finger positions, while medical object triage works well at MEDIUM resolution, saving tokens without sacrificing accuracy.
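A single ASL-frame call, sketched under the same assumptions (@google/genai field names; illustrative prompt and model ID), shows both dials at once:

```typescript
import { GoogleGenAI, MediaResolution } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Sketch: one ASL webcam frame at LOW thinking for latency and HIGH media
// resolution so subtle finger positions survive tokenization.
export async function recognizeSign(frameBase64: string): Promise<string> {
  const response = await ai.models.generateContent({
    model: "gemini-3-flash", // placeholder model ID
    contents: [
      { inlineData: { mimeType: "image/jpeg", data: frameBase64 } },
      { text: "Identify the ASL sign in this frame. Reply with the gloss only." },
    ],
    config: {
      thinkingConfig: { thinkingLevel: "low" },
      mediaResolution: MediaResolution.MEDIA_RESOLUTION_HIGH,
    },
  });
  return response.text ?? "";
}
```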

What's next for HealthBridge

  • Integration with hospital EHR systems for real patient records
  • Support for additional sign languages (BSL, LSF, JSL)
  • On-device ASL recognition for offline use in areas with poor connectivity

Built With

  • gemini-3-flash
  • gemini-3-pro
  • google-gemini-3
  • livekit
  • next.js
  • tailwind-css
  • typescript
  • vercel
  • webrtc