Medclara: AI Medical ASR & Automated Clinical Documentation Platform


Inspiration

Every single day, brilliant clinicians are buried under a mountain of administrative paperwork. For every hour spent looking a patient in the eye, doctors spend nearly two hours staring at a computer screen typing charts. This hidden friction drives widespread burnout and pulls focus away from actual patient care.

We were inspired to build Medclara after witnessing healthcare professionals spend their nights completing electronic health records (EHRs). Furthermore, modern clinics are increasingly global, yet existing transcription tools fail to support regional dialects and multilingual patient populations. We set out to build an ambient, lightning-fast, and secure medical scribe that acts as a quiet co-pilot in the exam room — giving clinicians their time and lives back.


What It Does

Medclara-EMR is a secure, ambient AI clinical documentation platform that listens to natural doctor-patient conversations and automatically generates structured, template-ready medical reports in seconds.

  • Ambient Listening & Diarization — Captures multi-party audio, distinguishing smoothly between the clinician and the patient.
  • Multilingual Fluency — Instantly transcribes and translates across 90+ languages and dialects, standardizing the final output into professional clinical English or preferred localized formats.
  • 54+ Specialty Templates — Whether it's a standard SOAP note, a History & Physical (H&P) report, or specialized psychiatry or cardiology layouts, the system populates structured notes instantly.
  • Tangible Impact — Eliminates manual note-taking, cuts administrative burden by 50%, and reclaims 2+ hours of clinician time every day.

How We Built It

We engineered Medclara to prioritize performance, security, and compile-time reliability over generic "AI wrapper" architectures.

Layer    | Technology                          | Purpose
Backend  | Go 1.24 + sqlc                      | High-concurrency, type-safe database queries
AI Core  | Google Vertex AI (Gemini 2.5 Flash) | Semantic text processing & low-latency summaries
Frontend | Next.js                             | Responsive SPA with dynamic Scribe Workspace
Database | PostgreSQL (multi-tenant)           | Isolated clinic data across encounters, notes & transcripts

Architecture Diagram

Challenges We Ran Into

Non-Linear Medical Conversations

Doctors and patients do not speak in the structured order of a medical chart — they jump from symptoms to history, back to symptoms.

Solution: We avoided rigid regex-based extraction and engineered a Template-Agnostic Architecture using role-based semantic parsing via Gemini 2.5 Flash, allowing the AI to correctly map clinical details no matter when they were spoken.
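
One way to picture a template-agnostic design is to treat each note layout as plain data and render the model instructions from it. The sketch below is a minimal illustration under that assumption: the Turn and Template types, the BuildPrompt function, and the instruction wording are all hypothetical, and the actual call to Gemini is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Turn is one diarized utterance; Role is "clinician" or "patient".
type Turn struct {
	Role string
	Text string
}

// Template describes a note layout purely as data, so a new specialty
// layout is a new value rather than new parsing code.
type Template struct {
	Name     string
	Sections []string
}

// instruction is illustrative wording, not the project's actual prompt.
const instruction = "Map each clinical detail to the matching section, regardless of when it was spoken."

// BuildPrompt renders the text that would be sent to the model:
// the template name, the instruction, the section list, and the
// role-tagged transcript in spoken order.
func BuildPrompt(t Template, turns []Turn) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Template: %s\n", t.Name)
	b.WriteString(instruction + "\n")
	b.WriteString("Sections:\n")
	for _, s := range t.Sections {
		fmt.Fprintf(&b, "- %s\n", s)
	}
	b.WriteString("Transcript:\n")
	for _, tr := range turns {
		fmt.Fprintf(&b, "[%s] %s\n", tr.Role, tr.Text)
	}
	return b.String()
}

func main() {
	soap := Template{
		Name:     "SOAP",
		Sections: []string{"Subjective", "Objective", "Assessment", "Plan"},
	}
	turns := []Turn{
		{Role: "patient", Text: "The cough started last week."},
		{Role: "clinician", Text: "Any history of asthma?"},
		{Role: "patient", Text: "Yes, and the cough is worse at night."},
	}
	fmt.Print(BuildPrompt(soap, turns))
}
```

Because the sections come from the Template value, switching from SOAP to an H&P layout changes only the data passed in, not the extraction logic.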

Audio Codec Fragmentation

Mobile and desktop browsers record audio in different container formats (e.g., Safari's limited WebM support vs. Chrome).

Solution: We built a flexible ingestion engine that accepts WebM, MP3, WAV, and OGG uploads.
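
An ingestion layer like this usually cannot trust the file extension or the browser-reported MIME type, so one common approach is to sniff the leading bytes of the upload. The following is a rough sketch of that idea, not the project's actual engine; DetectFormat is a hypothetical helper, and the checks are heuristics based on well-known file signatures.

```go
package main

import (
	"bytes"
	"fmt"
)

// DetectFormat inspects the leading bytes of an upload and returns
// "webm", "wav", "ogg", "mp3", or "unknown".
func DetectFormat(h []byte) string {
	switch {
	case bytes.HasPrefix(h, []byte{0x1A, 0x45, 0xDF, 0xA3}):
		// EBML header. It is shared by WebM and Matroska, so a full
		// parser would also read the DocType element.
		return "webm"
	case len(h) >= 12 &&
		bytes.Equal(h[0:4], []byte("RIFF")) &&
		bytes.Equal(h[8:12], []byte("WAVE")):
		return "wav"
	case bytes.HasPrefix(h, []byte("OggS")):
		return "ogg"
	case bytes.HasPrefix(h, []byte("ID3")):
		// An ID3v2 tag placed before the audio frames.
		return "mp3"
	case len(h) >= 2 && h[0] == 0xFF && h[1]&0xE0 == 0xE0:
		// Bare MPEG audio frame sync (11 set bits). This is only a
		// heuristic: other MPEG audio streams can match it too.
		return "mp3"
	}
	return "unknown"
}

func main() {
	samples := map[string][]byte{
		"chrome-recording": {0x1A, 0x45, 0xDF, 0xA3, 0x9F},
		"desktop-upload":   []byte("RIFF\x24\x00\x00\x00WAVEfmt "),
		"text-file":        []byte("not audio"),
	}
	for name, header := range samples {
		fmt.Printf("%s: %s\n", name, DetectFormat(header))
	}
}
```

Once the container is identified, the upload can be routed to the matching decoder or rejected early with a clear error.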


Accomplishments We're Proud Of

  • Seamless UX — An intuitive interface requiring exactly two clicks from a clinician: one to start recording, one to finalize the chart.
  • Production-Grade Architecture — Features 12-round bcrypt password hashing, rigorous role-based access control (RBAC), database transaction boundaries, and full API documentation.
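
To make the RBAC point concrete, here is a minimal deny-by-default permission check. It is only a sketch: the role names, permission strings, and the Can function are assumptions made for illustration and are not taken from the Medclara codebase.

```go
package main

import "fmt"

type Role string
type Permission string

// Hypothetical roles and permissions, chosen for illustration only.
const (
	RoleClinician Role = "clinician"
	RoleAdmin     Role = "clinic_admin"

	PermReadNote    Permission = "note:read"
	PermWriteNote   Permission = "note:write"
	PermManageUsers Permission = "user:manage"
)

// grants lists what each role may do. Anything absent is denied.
var grants = map[Role]map[Permission]bool{
	RoleClinician: {PermReadNote: true, PermWriteNote: true},
	RoleAdmin:     {PermReadNote: true, PermManageUsers: true},
}

// Can reports whether role holds perm. Unknown roles and unknown
// permissions both yield false, because indexing a missing map entry
// returns the zero value.
func Can(role Role, perm Permission) bool {
	return grants[role][perm]
}

func main() {
	fmt.Println(Can(RoleClinician, PermWriteNote))
	fmt.Println(Can(RoleClinician, PermManageUsers))
	fmt.Println(Can(Role("guest"), PermReadNote))
}
```

A check like this would typically run in HTTP middleware before the handler, so that no route can skip it by accident.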

What We Learned

Speed Beats Model Size in UX

In healthcare workflows, latency is everything. While larger models might offer minor edge-case reasoning advantages, Gemini 2.5 Flash's rapid response time is what makes the application feel like magic to a busy clinician.

Security Must Be Native, Not an Afterthought

When dealing with health data, you cannot simply slap auth on at the end. Architecting multi-tenant database isolation from day one is the only viable way to build a platform capable of handling sensitive patient information safely.
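
The isolation idea can be sketched without a database: every read and write takes the clinic identifier, so no code path exists that looks up a note by its ID alone. In the illustration below, NoteStore is a hypothetical in-memory stand-in for the PostgreSQL layer, and the SQL shown in the comment is an assumed query shape rather than the project's actual schema.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is returned both for missing notes and for notes that
// belong to another clinic, so callers cannot probe other tenants.
var ErrNotFound = errors.New("note not found")

type noteKey struct {
	ClinicID string
	NoteID   string
}

// NoteStore stands in for the database layer. Keying by (clinic, note)
// mirrors a query shaped like:
//   SELECT body FROM notes WHERE clinic_id = $1 AND id = $2
type NoteStore struct {
	notes map[noteKey]string
}

func NewNoteStore() *NoteStore {
	return &NoteStore{notes: make(map[noteKey]string)}
}

// Save stores a note under the owning clinic.
func (s *NoteStore) Save(clinicID, noteID, body string) {
	s.notes[noteKey{clinicID, noteID}] = body
}

// Get returns the note only when it belongs to clinicID.
func (s *NoteStore) Get(clinicID, noteID string) (string, error) {
	body, ok := s.notes[noteKey{clinicID, noteID}]
	if !ok {
		return "", ErrNotFound
	}
	return body, nil
}

func main() {
	store := NewNoteStore()
	store.Save("clinic-a", "note-1", "SOAP note for encounter 1")

	body, err := store.Get("clinic-a", "note-1")
	fmt.Println(body, err)

	_, err = store.Get("clinic-b", "note-1")
	fmt.Println(err)
}
```

Making the tenant part of every method signature means a missing clinic filter shows up as a compile error rather than a data leak.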


What's Next

We are just scratching the surface of what Medclara can achieve. Our upcoming roadmap includes:

  1. HL7 / FHIR Integration — Direct interoperability pipelines to push generated clinical notes into industry-standard EHR networks like Epic and Cerner.
  2. Live WebSocket Streaming — Moving from chunked post-processing to real-time, live-streaming audio transcription directly on the UI screen as the patient speaks.
  3. Automated Medical Coding — Using our clinical entity extraction layer to suggest accurate, context-aware ICD-10 (diagnosis) and CPT (procedure) codes automatically, streamlining the billing cycle for independent practices.
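
As a rough idea of what the FHIR item could involve, a generated note might be wrapped in a FHIR R4 DocumentReference resource before being sent to an EHR. The sketch below only builds the JSON payload. The Go type names and the NewNoteDocument helper are hypothetical, only a small subset of the resource's fields is modeled, and real Epic or Cerner integrations would add authentication and vendor-specific requirements not shown here.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Attachment carries the note text; FHIR expects Data as base64.
type Attachment struct {
	ContentType string `json:"contentType"`
	Data        string `json:"data"`
}

type Content struct {
	Attachment Attachment `json:"attachment"`
}

type Reference struct {
	Reference string `json:"reference"`
}

// DocumentReference models a small subset of the FHIR R4 resource.
type DocumentReference struct {
	ResourceType string    `json:"resourceType"`
	Status       string    `json:"status"`
	Subject      Reference `json:"subject"`
	Content      []Content `json:"content"`
}

// NewNoteDocument wraps plain-text note content for one patient.
func NewNoteDocument(patientID, noteText string) DocumentReference {
	return DocumentReference{
		ResourceType: "DocumentReference",
		Status:       "current",
		Subject:      Reference{Reference: "Patient/" + patientID},
		Content: []Content{{
			Attachment: Attachment{
				ContentType: "text/plain",
				Data:        base64.StdEncoding.EncodeToString([]byte(noteText)),
			},
		}},
	}
}

func main() {
	doc := NewNoteDocument("example-123", "Subjective: cough for one week.")
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```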

Links

🔗 GitHub Repository  |  📄 API Documentation

📄 More About Architecture
