Inspiration

Growing up, we sat in waiting rooms and exam rooms, expected to be translators for our parents with limited English, navigating a healthcare system that wasn't built for them. We were kids trying our best with our own limited Korean, working with our parents to understand a diagnosis neither of us fully understood. But the problem was never really about us. It was about what our parents experienced: the helplessness of not being able to describe their own symptoms accurately, of nodding along to discharge instructions they didn't understand, of leaving a hospital appointment less informed than when they arrived. It was about an adult, fully capable in their native language, forced to advocate for themselves in a language they were still learning, in a room full of terminology that even native English speakers find unfamiliar.

It turns out these moments play out thousands of times a day. Over 25 million people in the US have limited English proficiency (LEP). When they enter the healthcare system, they face a documented disparity: longer hospital stays, higher complication rates, and twice the risk of adverse events compared to English-speaking patients. The root cause isn't access, but communication.

The financial cost is just as staggering. Under Title VI of the Civil Rights Act, any hospital receiving federal funding is legally required to provide interpreter services at no cost to the patient. Most comply through a patchwork of on-call human interpreters, third-party phone services, and video remote interpreting contracts that cost hospitals an estimated $45-$150 per hour. For large health systems serving diverse urban populations, that adds up to millions of dollars annually.

Existing solutions are broken. On-call interpreter services average a 19-minute wait, generic translation apps miss medical nuance, and family members, especially children, are routinely pressed into service as informal translators for conversations they should never have to carry.

We built Entune because the technology to do this right finally exists. Better communication leads to fewer mistakes, shorter stays, and lower readmission rates. Entune doesn't just serve the patient. It serves the institution.

What it does

Entune is a real-time, bilingual medical translation platform designed to be culturally aware. During a live consultation, it runs across two devices: one for the provider, one for the patient. A provider creates a session and shares a 6-digit join code. The patient enters the code on their device to connect. From there, Entune handles everything in between.

Live bilingual transcript: Every spoken turn is transcribed and translated simultaneously. The provider sees clinically precise translations; the patient sees simplified, jargon-free versions. Both screens update in real time.

Cultural health flags: Entune detects culture-bound health concepts that don't translate directly into Western clinical language. For example, terms like nervios, a Latin American idiom for anxiety and somatic distress, or hwa-byung, a Korean suppressed anger syndrome. It then surfaces clinical context for the provider inline, without interrupting the conversation.

Jargon simplification: Medical terminology in the provider's speech is automatically simplified in the patient-facing translation. The patient hears "high blood pressure," not "hypertension."

Dual bilingual reports: At the end of every visit, Entune generates two completely different documents tailored to each audience. The provider receives a SOAP note ready to drop into their EHR. The patient receives a plain-language summary with medications, follow-up instructions, and warning signs, each in their own language, downloadable as a PDF.

Visit memory: Providers sign in with Google OAuth and all visits are saved. A dashboard shows visit history, and an AI chat lets providers ask questions about past visits. The AI answers strictly from documented visit history.

How we built it

Entune is built as a Next.js 15 application using the App Router and TypeScript, deployed on Vercel, with Supabase handling Postgres storage, real-time transcript sync between devices, and Google OAuth authentication.

The real-time transcription pipeline is powered by Deepgram's streaming speech-to-text API with automatic language detection, which listens to both speakers simultaneously and returns transcribed text with low enough latency to feel live. Each transcribed turn is then passed to the Claude API, which handles translation into the target language, jargon simplification in the patient-facing direction, cultural health concept detection, SOAP note generation, and the visit history chat assistant.
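The per-turn routing described above can be sketched as a pure function: depending on who spoke, the turn is translated toward the other party, and jargon simplification is applied only in the patient-facing direction. Types and prompt text here are assumptions for illustration, not Entune's actual implementation:

```typescript
// Illustrative routing of a transcribed turn before it is sent to the model.
type Speaker = "provider" | "patient";

interface Turn {
  speaker: Speaker;
  text: string;
  sourceLang: string; // e.g. "en"
}

interface TranslationJob {
  targetLang: string;
  simplifyJargon: boolean;
  systemPrompt: string;
}

function buildTranslationJob(
  turn: Turn,
  providerLang: string,
  patientLang: string
): TranslationJob {
  // Provider speech flows toward the patient and gets simplified;
  // patient speech flows toward the provider with precision preserved.
  const patientFacing = turn.speaker === "provider";
  const targetLang = patientFacing ? patientLang : providerLang;
  return {
    targetLang,
    simplifyJargon: patientFacing,
    systemPrompt: patientFacing
      ? `Translate into ${targetLang}. Replace medical jargon with plain language.`
      : `Translate into ${targetLang}. Preserve clinical precision and terminology.`,
  };
}
```

The actual pipeline folds more into a single model call, but the asymmetry between the two directions is the core idea.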

We deliberately split these responsibilities: Deepgram for what it does best (fast, accurate speech recognition), and Claude for what it does best (medically nuanced language understanding, cultural context, and clinical summarization). The translation pipeline sends each utterance to Claude with a system prompt that preserves medical precision for providers while simplifying jargon for patients. Cultural health concepts are matched against a curated glossary and Claude's own training data, then surfaced as contextual flags with clinical explanations and screening recommendations.
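The curated-glossary pass can be sketched as a simple substring match over known culture-bound terms before anything reaches the model. The entries and matching logic below are illustrative assumptions (Entune's real glossary is larger and its matching more careful):

```typescript
// Minimal sketch of scanning an utterance against a curated glossary
// of culture-bound health concepts.
interface CulturalConcept {
  term: string;
  aliases: string[];
  clinicalNote: string;
}

const glossary: CulturalConcept[] = [
  {
    term: "hwa-byung",
    aliases: ["hwabyung", "화병"],
    clinicalNote:
      "Korean suppressed-anger syndrome; screen for depression and somatic symptoms.",
  },
  {
    term: "nervios",
    aliases: ["ataque de nervios"],
    clinicalNote:
      "Latin American idiom of distress; may present as anxiety with somatic complaints.",
  },
];

function matchCulturalConcepts(utterance: string): CulturalConcept[] {
  const lower = utterance.toLowerCase();
  return glossary.filter(
    (c) =>
      lower.includes(c.term) ||
      c.aliases.some((a) => lower.includes(a.toLowerCase()))
  );
}
```

Anything the glossary catches is attached to the turn as a contextual flag; the model then fills in concepts the static list misses.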

The frontend is built with Tailwind CSS and shadcn/ui, with dark mode support and responsive design. Two of our team members focused entirely on the interface: designing a dual-speaker transcript view, live alert sidebar, and the dual-report system. The other two members architected the AI pipeline and real-time infrastructure. React PDF handles generation of downloadable provider and patient report PDFs.

The entire stack was chosen for speed and reliability under hackathon conditions: Vercel deployments gave us instant previews, Supabase got us a working backend with real-time subscriptions without standing up infrastructure, and the Deepgram + Claude combination gave us a production-quality AI pipeline in hours rather than days.

Challenges we ran into

One of our trickiest technical challenges was handling code-switching: when a patient shifts mid-sentence from English to another language, the way real bilingual speakers naturally do. Deepgram's streaming transcription is configured for a primary language, and mid-sentence switches would either get transcribed incorrectly or silently dropped. We had to build a detection layer on top of the raw transcript that could identify language shifts in real time and route those segments correctly before passing them to Claude for translation.
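A much-simplified version of that detection layer: segment an utterance by Unicode script, so that (for example) Hangul spans inside an otherwise English sentence can be routed separately. The real pipeline is more involved; this heuristic only illustrates the idea:

```typescript
// Split text into runs of the same script; punctuation and spaces
// fold into the current run.
type Script = "hangul" | "latin" | "other";

function scriptOf(char: string): Script {
  if (/[\uAC00-\uD7AF\u1100-\u11FF]/.test(char)) return "hangul"; // syllables + jamo
  if (/[A-Za-z]/.test(char)) return "latin";
  return "other";
}

interface Segment {
  script: Script;
  text: string;
}

function segmentByScript(text: string): Segment[] {
  const segments: Segment[] = [];
  for (const char of text) {
    const script = scriptOf(char);
    const last = segments[segments.length - 1];
    if (last && (script === last.script || script === "other")) {
      last.text += char; // same script, or neutral char: extend current run
    } else if (last && last.script === "other") {
      last.script = script; // a leading neutral run adopts the first real script
      last.text += char;
    } else {
      segments.push({ script, text: char });
    }
  }
  return segments;
}
```

Each non-primary-language segment can then be re-transcribed or translated on its own path instead of being silently dropped.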

On the frontend, keeping the live transcript stable while new turns streamed in proved harder than expected. Getting transcript entries to appear simultaneously on both the provider's and patient's screens required careful orchestration of Supabase real-time subscriptions, optimistic UI updates, and auto-scrolling behavior. Appending new turns to the DOM caused the view to jump constantly. We had to carefully manage scroll position, anchoring the view to the bottom only when the user was already there, and preserving their position if they had scrolled up to review an earlier turn.
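The scroll-anchoring rule reduces to one predicate: stick to the bottom only when the user was already there (within a small slack) before the new turn was appended. Parameter names below are assumptions, not the actual component code:

```typescript
// True when the viewport bottom is within `threshold` px of the content
// bottom, i.e. the user has not scrolled up to review an earlier turn.
function shouldStickToBottom(
  scrollTop: number,    // current scroll offset
  clientHeight: number, // visible height of the transcript pane
  scrollHeight: number, // total content height
  threshold = 40        // px of slack before we consider the user "away"
): boolean {
  return scrollHeight - (scrollTop + clientHeight) <= threshold;
}

// On each new turn: read the predicate BEFORE appending, then only if it
// was true set element.scrollTop = element.scrollHeight after the append.
```

Evaluating the predicate before the DOM update is the key detail; afterwards the new turn has already pushed the user away from the bottom.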

The cultural flag detection required a lot of prompt engineering to get right. Our first attempts flagged far too aggressively, treating any non-English word or unfamiliar term as a potential culture-bound syndrome. We also had to solve for dual-audience translation, where the same utterance needs two completely different outputs: clinically precise for the provider, simplified for the patient. Balancing accuracy with accessibility in a single API call, while also getting Claude to return consistently structured JSON with proper markdown for the SOAP notes, took more iteration than any other part of the pipeline.
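Part of taming the structured-output problem is refusing to trust the model's JSON: every response is validated before it touches the UI, and malformed responses are rejected (and, in practice, retried). The field names below are assumptions about the shape, not Entune's actual schema:

```typescript
// Validate a dual-audience translation response from the model.
interface DualTranslation {
  providerText: string; // clinically precise
  patientText: string;  // simplified, jargon-free
  flags: string[];      // culture-bound concepts detected, possibly empty
}

function parseDualTranslation(raw: string): DualTranslation | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Partial<DualTranslation>;
  if (
    typeof d.providerText !== "string" ||
    typeof d.patientText !== "string" ||
    !Array.isArray(d.flags) ||
    !d.flags.every((f) => typeof f === "string")
  ) {
    return null; // wrong shape: reject rather than render garbage
  }
  return d as DualTranslation;
}
```

A schema library would do the same job more declaratively; the point is that validation, not the prompt alone, is what makes "consistently structured" hold in a live conversation.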

Accomplishments that we're proud of

We're proud that Entune feels like a real product and not a hackathon demo. The live bilingual transcript, cultural flagging, dual-report generation, and visit memory all work end-to-end in a single cohesive experience. Our interface is polished enough that you could put it in front of an actual clinic volunteer and they'd know exactly what to do.

The cultural health flag feature is the thing we're most proud of technically. When a Korean patient says "화병인 것 같아요" ("I think I have hwa-byung"), both the translation and a clinical context card appear, explaining the cultural syndrome, what to screen for, and safety considerations. That is the kind of context that gets completely lost with Google Translate. Getting Claude to surface it reliably, without flagging every foreign word, stumped us for a bit. We solved it through careful prompt design and a curated cultural glossary, and the result is something that doesn't exist in any current medical translation tool.

The dual-report system is something we're equally proud of. One visit produces two completely different documents tailored to each audience, a SOAP note the provider can drop into their EHR, and a plain-language summary the patient can actually understand and take home in their own language.

Finally, we're proud that Entune is grounded in a real, documented problem. Every design decision traces back to something we've experienced or witnessed in clinics serving LEP patients: the code-switching, the culture-bound syndromes, the family member pressed into translating.

What we learned

Building Entune taught us that prompt engineering is a discipline. Getting Claude to detect cultural health concepts accurately, produce two distinct translations of the same utterance for two different audiences, and generate consistently structured SOAP notes every single time required far more iteration than we expected. We learned that bad prompt design shows up immediately in a live medical conversation where precision actually matters.

As this was most of our team members' first hackathon, we also learned what it means to build and ship within such a short time frame. Decisions that felt abstract at the start of our planning became very concrete at 2 AM. We learned to ask "does this make the demo better?" before building anything, and to ruthlessly cut anything that didn't. That discipline is why Entune works end-to-end instead of shipping as a pile of half-finished features.

On the technical side, integrating Deepgram, Claude, Supabase, and Vercel into a single coherent real-time pipeline taught us how much invisible complexity lives at the seams between services: authentication, latency, data formatting, and error handling. Real-time multi-device sync through Supabase subscriptions made the two-device architecture possible, but handling edge cases like reconnection, out-of-order messages, and scroll behavior took more effort than expected.

But the deepest thing we learned was about the problem itself. We came in knowing that language barriers in healthcare were significant. We didn't fully appreciate how much of that gap is cultural rather than linguistic. A perfect word-for-word translation of a clinical conversation can still leave a patient completely unserved if the provider doesn't truly understand what hwa-byung means, or why a patient describing nervios isn't just anxious. Language is the surface. Culture is the substance. Entune exists in the gap between them.

What's next for Entune

Our most immediate priority is language expansion. Entune currently supports Spanish and Korean, but the LEP population in the US spans hundreds of languages. Deepgram and Claude both support a wide range of languages, and expanding coverage is the single highest-impact thing we can do to serve more patients.

Beyond language coverage, two features stand out as the most meaningful next steps.

The first is emotion and tone detection for underreporting. We've experienced first-hand that LEP patients underreport symptoms, whether from language limitations, fear, cultural deference to authority figures, or not wanting to be a burden. Entune is already listening to the conversation in real time. The next step is analyzing not just what is said, but how it is said. Hesitation, vague language, deflection, and minimization would surface a quiet alert to the provider: "Patient may be underreporting. Consider following up on this directly."

The second is informal translator detection. A recurring issue in LEP healthcare is the use of family members as medical interpreters. Entune can detect when a third voice enters the conversation or when a patient defers to an accompanying person to speak on their behalf, and prompt the provider to consider a qualified interpreter instead. It's a small intervention with serious clinical and ethical implications.

We also have the infrastructure in place to add ElevenLabs text-to-speech so translations are spoken aloud — helping patients who are more comfortable listening than reading. And longer term, direct EHR integration would let providers export SOAP notes without any copy-pasting, making Entune a seamless part of the clinical workflow rather than a separate tool.

Built With

  • anthropic
  • claude
  • deepgram
  • next.js
  • oauth
  • shadcn
  • supabase
  • tailwind
  • vercel