Inspiration

Every year, millions of elderly patients struggle to navigate complex digital health portals like MyChart. They forget medication schedules, miss appointments, and can't easily access their own lab results — not because the information doesn't exist, but because the interfaces weren't designed for them. We watched our own grandparents fumble through tablets and phone screens just to check a prescription refill, and we thought: what if your medical records could just talk to you?

We were inspired by the idea that healthcare technology should meet patients where they are, not the other way around. For someone with limited mobility, poor eyesight, or low digital literacy, a simple voice conversation is infinitely more accessible than a multi-step login portal. We set out to build .dot — a warm, always-available voice companion that brings the power of a patient portal into a natural conversation, right at the bedside.

What it does

.dot is an AI-powered healthcare companion that lets elderly patients interact with their full medical records through natural voice conversation. Instead of navigating screens, patients simply say "Hey Jarvis" and ask questions like "When is my next appointment?" or "What are the side effects of my blood pressure medication?"

Key capabilities include:

Voice-first medical record access — Patients can ask about prescriptions, lab results, upcoming appointments, visit history, and their care team, all through real-time speech.
Medication barcode scanning — Patients can scan any medication bottle with a USB barcode reader, and .dot matches it against their prescriptions, confirming dosage and refill status and flagging potential issues. It also automatically searches for interactions between drugs and gives patients real-time, relevant advice.
Medical information search — When patients have general health questions, .dot searches trusted medical sources (PubMed, NIH, FDA, Mayo Clinic, CDC) and relays concise, accurate answers.
Emergency escalation — If .dot detects signs of a medical emergency (chest pain, stroke symptoms, difficulty breathing), it can immediately place a phone call to the patient's emergency contact or primary care provider via Twilio.
LED feedback — A physical LED strip provides ambient visual cues: a gentle breathing purple when idle, solid blue when listening, and a red/blue chase pattern when thinking. This gives patients a tangible sense of .dot's state without needing a screen.
Full patient portal — A companion Next.js web application serves as the data backbone, with admin tools for providers to manage patient records and a patient-facing dashboard mirroring a MyChart-like experience.
Conversation logging — Every interaction is logged with full transcripts and streamed via WebSocket to a mobile app that caregivers and providers can review.

How we built it

.dot is a four-layer system: a real-time voice agent, a full-stack mock patient portal, a caregiver mobile app, and a hardware feedback loop.

Voice Agent (Python): The core of .dot is a Python application that connects to OpenAI's Realtime API over WebSocket, streaming bidirectional audio at 24kHz. We use server-side voice activity detection so the agent naturally detects when the patient starts and stops speaking. The agent is configured with the patient's full medical chart injected into its system prompt, giving it grounded context to answer questions without hallucinating. We integrated the Perplexity API (using the sonar model) for medical search, filtered exclusively to trusted medical domains. For emergency calling, we built a Twilio bridge that creates real-time bidirectional phone calls. The barcode scanner pipeline queries the FDA's NDC database and UPCItemDB to identify scanned medications and cross-references them against the patient's prescription list.
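To make the audio loop concrete, here is a minimal sketch of the connection and session setup. The model name, audio format values, and the chart variable are placeholders rather than our exact configuration:

```python
# Minimal sketch: open a Realtime API WebSocket, inject the patient chart,
# and enable server-side voice activity detection. Placeholder values throughout.
import asyncio
import json
import os

import websockets  # pip install websockets

PATIENT_CHART = "..."  # full chart text pulled from the portal cache

async def run_agent():
    url = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # websockets >= 14 uses additional_headers; older releases call it extra_headers
    async with websockets.connect(url, additional_headers=headers) as ws:
        # Ground the agent in the patient's chart and let the server detect turns
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "instructions": "You are .dot, a bedside companion.\n\n"
                                "Patient chart:\n" + PATIENT_CHART,
                "turn_detection": {"type": "server_vad"},
                "input_audio_format": "pcm16",   # 24 kHz PCM in and out
                "output_audio_format": "pcm16",
            },
        }))
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "response.audio.delta":
                pass  # base64-decode event["delta"] and push to the speaker queue

asyncio.run(run_agent())
```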

Wake Word Detection: We trained a custom wake word model using OpenWakeWord, generating synthetic training samples with Piper TTS and augmenting them with noise. The model achieves 92% accuracy and 87% recall for the phrase "Hey Jarvis," running locally via ONNX inference with no cloud dependency.
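A rough sketch of the local detection loop, assuming a 16 kHz microphone stream; the model path and the score threshold are illustrative, and the score key follows OpenWakeWord's convention of using the model file stem:

```python
# Sketch of on-device wake word detection with OpenWakeWord + ONNX inference.
import numpy as np
import pyaudio
from openwakeword.model import Model

model = Model(wakeword_models=["models/hey_jarvis.onnx"], inference_framework="onnx")

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=16000,
                    input=True, frames_per_buffer=1280)  # 80 ms frames at 16 kHz

while True:
    frame = np.frombuffer(stream.read(1280), dtype=np.int16)
    scores = model.predict(frame)      # {"hey_jarvis": score between 0 and 1}
    if scores["hey_jarvis"] > 0.5:     # tune the threshold against false positives
        print("Wake word detected, starting a Realtime session")
        break
```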

Patient Portal (Next.js + PostgreSQL): We built a full-featured MyChart-style portal using Next.js 15 with the App Router, React 19, Tailwind CSS, and shadcn/ui components. Authentication is handled by NextAuth.js with credentials-based login. The data layer uses Prisma ORM with a comprehensive schema covering patients, providers, prescriptions, lab results, visits, and appointments. An admin interface lets providers create and manage patient records.

Browserbase (Stagehand): We built a browser automation pipeline with Browserbase's Stagehand that can scrape real patient data from existing MyChart portals. This runs as a scheduled cron job and caches patient data locally for the agent to use as context.

Hardware (Arduino + WS2812B LEDs): We connected a 32-LED addressable RGB strip to an Arduino, communicating over serial at 115200 baud. The Python agent sends single-character commands ('1', '2', '3') to switch between animation states — breathing purple for idle, red/blue chase for thinking, and solid blue for listening. The LED animations are entirely self-contained on the Arduino using the FastLED library.
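The host side of that link reduces to a few lines of pyserial; the port name and the exact character-to-state mapping below are assumptions for illustration:

```python
# Sketch of the Python side of the LED feedback loop: one character per state.
import serial  # pip install pyserial

led = serial.Serial("/dev/ttyUSB0", 115200, timeout=1)  # port name is an assumption

def set_led_state(state: str) -> None:
    """'1' = breathing purple (idle), '2' = solid blue (listening), '3' = red/blue chase (thinking)."""
    led.write(state.encode())

set_led_state("2")  # wake word heard: show solid blue while listening
```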

Deployment: The patient portal is deployed on Vercel with a PostgreSQL database in production, while the voice agent and hardware components run locally on a bedside device.

Challenges we ran into

Real-time audio reliability was our biggest hurdle. Streaming bidirectional audio over WebSocket with the OpenAI Realtime API required careful management of audio buffers, playback queues, and interruption handling. We spent significant time tuning chunk sizes and handling edge cases like the agent trying to speak while the patient was still talking.
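The interruption case roughly reduces to the pattern below, sketched with an illustrative handler; `playback_queue` and `ws` stand in for objects owned by the main agent loop, and the event names follow the Realtime API:

```python
# Sketch of barge-in handling: when server VAD reports the patient has started
# speaking, drop any queued agent audio and cancel the in-flight response.
import json

async def handle_event(event: dict, ws, playback_queue) -> None:
    if event["type"] == "input_audio_buffer.speech_started":
        while not playback_queue.empty():
            playback_queue.get_nowait()          # drop queued agent audio locally
        await ws.send(json.dumps({"type": "response.cancel"}))  # stop the current response
```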

WebSocket stability proved tricky — we went through multiple iterations trying to get WSS (secure WebSockets) working reliably for our conversation logging server, debugging SSL certificate issues and connection drops.
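A minimal sketch of serving a WSS endpoint directly from Python with an explicit SSL context, assuming a certificate pair on disk; paths, port, and handler body are placeholders:

```python
# Sketch: a TLS-terminated WebSocket server for conversation-log streaming.
import asyncio
import ssl

import websockets

async def log_handler(ws):  # single-argument handler (websockets >= 13)
    async for message in ws:
        print("transcript chunk:", message)  # fan out to the caregiver app here

async def main():
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile="fullchain.pem", keyfile="privkey.pem")
    async with websockets.serve(log_handler, "0.0.0.0", 8765, ssl=ctx):
        await asyncio.Future()  # run until cancelled

asyncio.run(main())
```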

Wake word accuracy was a challenge with limited training data. Our initial models had too many false positives in noisy environments. We iteratively improved by generating more diverse synthetic training samples and applying audio augmentation techniques, eventually reaching 92% accuracy.

Barcode-to-prescription matching required stitching together multiple APIs. Medication barcodes don't use a single universal standard — we had to handle NDC codes, UPC codes, and Rx numbers, falling back through multiple lookup services to reliably identify a drug and match it to the patient's records.
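As one illustration of that fallback chain, here is a simplified sketch of the openFDA NDC lookup and prescription match; field names follow the openFDA NDC schema, and the matching logic is far simpler than what we actually needed:

```python
# Sketch: resolve a scanned NDC via openFDA and match it to the patient's prescriptions.
import requests

def lookup_ndc(product_ndc: str) -> dict | None:
    # The scanned package barcode is normalized to a product-level NDC upstream
    resp = requests.get(
        "https://api.fda.gov/drug/ndc.json",
        params={"search": f'product_ndc:"{product_ndc}"', "limit": 1},
        timeout=10,
    )
    results = resp.json().get("results", [])
    return results[0] if results else None

def match_prescription(product_ndc: str, prescriptions: list[dict]) -> dict | None:
    drug = lookup_ndc(product_ndc)
    if drug is None:
        return None  # fall back to UPCItemDB / Rx-number lookup
    name = (drug.get("generic_name") or drug.get("brand_name", "")).lower()
    return next((rx for rx in prescriptions if rx["name"].lower() in name), None)
```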

Prompt engineering for safety was critical. We needed the agent to be helpful and conversational while strictly refusing to diagnose conditions or prescribe treatments. Balancing warmth with medical safety guardrails — and getting the agent to consistently detect emergency situations without false alarms — took extensive iteration.
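The flavor of those guardrails, sketched as an abbreviated prompt excerpt; the wording and the escalate_emergency tool name are illustrative, not our shipped prompt:

```python
# Abbreviated, illustrative version of the safety rules in the system prompt;
# the shipped wording differs and escalate_emergency is a hypothetical tool name.
SAFETY_RULES = """
You are .dot, a warm bedside companion. Keep answers to one or two sentences.

You must NEVER:
- diagnose a condition or suggest a diagnosis
- prescribe, adjust, or stop any medication
- minimize or dismiss a symptom the patient reports

If the patient describes chest pain, stroke symptoms, or difficulty breathing,
call the escalate_emergency tool immediately instead of answering.
"""
```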

Accomplishments that we're proud of

End-to-end voice interaction with real medical data — A patient can walk up to .dot, say "Hey Jarvis," ask about their medications, scan a pill bottle for verification, and get grounded answers from their actual medical records, all without touching a screen.
Emergency escalation that actually calls — .dot doesn't just alert; it places a real phone call to the patient's emergency contact or doctor via Twilio, bridging the gap between detection and action.
Custom wake word model — We trained our own "Hey Jarvis" model from scratch with 92% accuracy, running entirely on-device with no cloud dependency.
A complete, functional patient portal — Not just a mockup, but a full-stack application with authentication, role-based access, CRUD operations, and realistic seed data that mirrors a real MyChart experience.
Physical hardware integration — The LED strip transforms .dot from software into something tangible. The ambient breathing animation gives elderly users a gentle, non-threatening presence in their room, and the visual state feedback eliminates the "is it listening?" uncertainty.
Conversation logging and auditability — Every interaction is recorded and streamable in real time via WebSocket, enabling caregivers to review what their loved one asked about and how the agent responded.

What we learned

Voice-first design requires a different mindset. When there's no screen to fall back on, every interaction must be self-explanatory. We learned to design for brevity — the agent responds in 1-2 sentences — and for recovery, gracefully handling misunderstandings.
Medical AI needs hard guardrails, not soft ones. It's not enough to tell the model to "be careful." We had to explicitly enumerate what the agent must never do (diagnose, prescribe, minimize symptoms) and build structural safeguards like the emergency detection system.
Hardware makes software feel real. Adding the LED strip was a relatively small engineering effort, but it completely changed how people perceived .dot. It went from "a program running on a laptop" to "a thing in the room that's alive."
Real-time APIs are powerful but unforgiving. The OpenAI Realtime API enabled an experience that would have been impossible with traditional request-response patterns, but debugging streaming WebSocket audio with concurrent playback, recording, and function calls pushed our understanding of async programming.
Data plumbing is underrated. Connecting the barcode scanner to the FDA API to the patient's prescription list to the voice agent's context required careful data normalization across multiple formats and APIs. The "boring" integration work was some of the most important.

What's next for .dot

Integration with real EHR systems — Moving beyond our mock portal to connect with actual Epic MyChart and other EHR platforms via FHIR APIs, so .dot can serve real patients with real medical data. We have already built an OpenEMR integration pipeline, populated it with synthetic medical data from Synthea, and attached a video of the pipeline to the README in our GitHub repo.
Multi-language support — Many elderly patients are more comfortable in their native language. We want to add multilingual voice interaction so .dot can converse in Spanish, Mandarin, Tagalog, and other languages common in home care settings.
Caregiver dashboard — A real-time web interface for family members and home health aides to monitor .dot conversations, receive alerts, and track medication adherence patterns over time.
Clinical validation — Partnering with home health agencies and geriatric care providers to pilot .dot with real patients, measuring outcomes around medication adherence, emergency response times, and patient satisfaction.
HIPAA compliance and security hardening — Implementing end-to-end encryption, secure credential storage, and audit logging that meets healthcare regulatory requirements for real-world deployment.
