Inspiration
Modern medicine has achieved remarkable progress—curing once-incurable diseases, expanding global healthcare access, and digitizing consultations into a few clicks. Yet, one group remains underserved: our senior citizens. Despite easier access to doctors and better treatments, many older adults still struggle to articulate symptoms, understand medical terminology, and manage their own health records across providers. HearMeOut was created to bridge this communication gap—making healthcare more accessible, empathetic, and effective for all.
What it does
HearMeOut is an AI-driven accessibility service that acts as a real-time translator between doctors and patients.
The system records both speakers, identifies who is talking, and uses Open-AI Whisper to transcribe audio of the conversation. We then format the input text and pass it into the ChatGPT 4.1 Nano Model, which dynamically translates each message into language the other party can understand—generating simplifications for medical terminology for patients and clarifying vague or emotional descriptions for doctors.
During the conversation, key medical phrases (symptoms, diagnoses, medications, and treatments) are automatically highlighted on screen in real-time along with the translations.
After the visit, HearMeOut automatically generates two outputs: a patient-friendly After Visit Summary and a doctor-formatted SOAP note/EHR (Electronic Health Record) entry, ensuring both sides leave with clarity and confidence. We also modify our transcript to highlight keywords relevant to patient diagnosis and treatment. Both EHR generation and keyword generation are done using Gemini 2.5 Flash due to its proficiency in analytical thinking.
How we built it
We built HearMeOut on a scalable, modular architecture designed for accessibility and medical precision.
Speech Layer: Audio is captured and transcribed using GPTWhisper, which delivers high-accuracy, low-latency speech recognition optimized for clinical terminology. This enables real-time transcription of both doctor and patient voices with reliable speaker separation.
Language Layer: Transcribed text is translated to English through the Google Translate API, and then processed through Google Gemini 2.5 Flash, which classifies speakers and formats the input text. The text is then passed to Open AI ChatGPT 4.1 Nano which translates medical jargon into clear, patient-friendly language and translates patient information for doctors in clinical terms. Once the consultation is finished, the transcript is then received by Open AI ChatGPT 4.0 which generates a consultation transcript annotated with keywords related to patient diagnosis and treatment, a meeting summary that is ready to email to the patient, and SOAP Notes and EHRs that are conveniently available for doctors to review.
Backend Logic: We orchestrated the entire NLP pipeline with N8N, creating modular prompt flows for speaker classification, translation summarization, and automatic generation of After-Visit Summaries and SOAP notes.
Data Layer (Snowflake): Patient data and active session transcripts are securely stored in Snowflake, allowing for efficient, real-time querying and retrieval of ongoing doctor–patient conversations. This structure enables longitudinal data tracking, maintaining a continuous medical record that evolves over the course of the consultation.
Frontend: Built in Next.js, the interface displays live transcription, keyword highlighting, and speaker identification—ensuring clarity, accessibility, and trust throughout the consultation.
Privacy & Compliance: Every component of HearMeOut was designed with HIPAA-aligned data practices, emphasizing encryption, anonymization, and least-privilege access. All patient transcripts and health data stored in Snowflake are secured with end-to-end encryption and role-based access control, ensuring that sensitive information remains private and auditable.
Challenges we ran into
One of our biggest technical challenges was building a dual-speaker transcription pipeline that could operate in real time. Accurately distinguishing between doctor and patient voices required fine-tuning diarization models to handle overlapping speech and environmental noise.
Beyond transcription, real-time translation proved equally difficult. Ensuring that GPTWhisper outputs were instantly processed through our translation layer—with minimal delay—required us to design an optimized buffering and streaming system. We struggled to balance latency and accuracy, as even small delays disrupted the conversational flow.
Another major challenge was synchronizing transcription, translation, and on-screen rendering. Each step (speech recognition, text generation, and NLP processing) added milliseconds of delay that compounded over time. Achieving a smooth, responsive interface meant parallelizing tasks, caching results, and carefully managing API response times.
Finally, building prompt pipelines that preserved medical accuracy while simplifying language for patients demanded continuous iteration. Translating complex medical jargon without losing nuance was one of the hardest—and most rewarding—parts of the project.
Accomplishments that we're proud of
We’re incredibly proud of building a fully functional prototype that can record doctor–patient conversations, classify speakers in real time, highlight key medical terms, and automatically generate both EHR-ready notes and patient-friendly summaries.
For many of us, this was our first time working with tools like N8N, yet within just 36 hours, we learned how to orchestrate complex prompt flows, automate multi-step NLP processing, and integrate multiple APIs into a seamless pipeline. Overcoming that learning curve and seeing our system run smoothly in real time was one of our proudest moments.
Beyond the technical milestones, we’re proud that HearMeOut represents more than just innovation—it embodies empathy. Watching our demo generate clear, accessible summaries that bridge the communication gap between doctors and patients reminded us exactly why we built it: to make understanding a part of care.
What we learned
This project gave us a deep appreciation for how AI and automation can enhance accessibility and patient care. We realized that improving doctor–patient communication goes far beyond transcription—it requires empathy, precision, and an understanding of how people process medical information under stress.
We also gained insight into the many protocols that shape real clinical consultations, from diagnostic questioning to terminology use, and how technology can be applied to make those interactions clearer and more human. Experiencing firsthand how AI can optimize communication workflows reinforced our belief in its power to create meaningful, tangible impact on quality of life.
On the technical side, we learned how to build adaptive, automated workflows that connect front-end interfaces, AI-based language processing, and database management. We learned to pivot ideas based on software and time constraints, prioritizing features that delivered the most immediate value. Ultimately, we came away not just with new technical skills—but with a stronger sense of how thoughtful engineering can make healthcare more inclusive, efficient, and compassionate.
What's next for HearMeOut
Looking ahead, we plan to expand HearMeOut into a more inclusive and intelligent healthcare communication platform. Our goals include:
Broadened Multilingual Support: Extending translation capabilities across multiple languages to make medical conversations universally accessible.
Accessibility for All: Expanding the platform to better serve individuals with disabilities and those with low health literacy, ensuring that no patient is left behind in understanding their care.
Multimodal Translation Processing: Integrating additional input and output modalities—such as visual aids and adaptive text formatting—to enhance comprehension and flexibility across diverse use cases.
Expanded Patient Management: Building tools for longitudinal health tracking, allowing patients and doctors to view ongoing summaries, medications, and care history directly within the app.
Our vision is to make HearMeOut a trusted bridge between patients and providers—one that speaks every language, understands every voice, and empowers every person to take charge of their health.
Built With
- fhir-ehr
- google-gemini
- google-translate-api
- n8n
- next.js
- openai-gpt
- openai-whisper
- snowflake
- tailwind

Log in or sign up for Devpost to join the conversation.