Doctor's Dashboard
Schedule Follow ups
Medications overview
Patients care plan
Voice agent

CareCall

Care that follows you home.

Inspiration

Kush's grandparents don't text. They don't tap through apps or log into patient portals. That's just not how they live, and no amount of awareness is going to change it. But a phone that rings, with a real voice on the other end? That they answer. It's natural for them, and it's dignified.

His grandfather speaks Hindi, not English. So every time he came home from the hospital, the same thing happened. He would have a folder of discharge papers he couldn't read and a memory of a doctor's visit he couldn't fully follow. He'd complain, not that he was sick, but that he didn't understand. What were the pills for? What was he supposed to watch out for? Nobody was going to call and ask how he was doing. The system would only find out something had gone wrong when he showed up at the ER again.

That's not an edge case. That's the discharge gap, and it hits hardest for the people with the least room to spare: the elderly, the low-income, and anyone trying to follow care instructions written in a language or at a reading level that wasn't built for them. These patients don't skip their medications out of carelessness. They run into a wall. The prescription costs too much, the instructions don't make sense, and no one checks in.

The problem isn't that the clinician failed to explain something. It's that nobody is watching in the days right after discharge, when things quietly go off track. We built CareCall to fill that gap. It reaches people the way Kush's grandfather can actually be reached, in the language he actually speaks, and it does this without ever pretending to be the doctor.

What it does

CareCall reads a patient's discharge summary, then reaches out on a regular schedule to check that they're taking their medications and recovering on plan. It works over two channels that do two different jobs:

SMS, the friendly reminder. Short, frequent nudges in the patient's language, like "Don't forget your blood pressure medicine today."
Voice, the periodic check-in. A real spoken conversation that asks how things are going, listens, and works out whether the patient is actually on track.

Neither channel is a backup for the other. They work as a pair, and the voice call is what lets CareCall reach people that a text-only system would quietly leave out.

Each check-in does four things:

Ingest and structure. A patient or staff member uploads a discharge summary as a PDF or text. CareCall pulls the care plan (medications, procedures, warning signs) into a clean, organized format.
Reach out gently. It calls or texts in plain language, like "Did you take your blood pressure medicine today?", and listens to the answer.
Classify, not diagnose. Each answer gets sorted into an adherence status along with the reason behind it, whether that's cost, access, or confusion, captured in the patient's own words.
Surface, never overwrite. The results land on a read-only dashboard for the care team, sorted by risk with the evidence attached. A clinician reads it and decides what to do. CareCall never writes anything back to the medical record. All it knows is "the patient reported X," which is an unverified claim that should never become part of a medical record without a human reviewing it first.

There's one more thing, and it runs before any of this. Every time a patient speaks, their words first hit a fixed safety check. If someone says "chest pain" or "no puedo respirar," CareCall stops trying to be smart, reads a set safety script, and routes the patient to a human. Danger always cuts the AI out of the loop.

That's the whole idea in one line: smart about helping people stay on track, and deliberately simple about danger.

How we built it

The backend is FastAPI (Python 3.12). The frontend is React 19 with Vite, TypeScript, and Tailwind, with separate portals for doctors, patients, and the care team.

Privacy built into the architecture. We decided this as a team before anyone wrote code. An Identity Vault (built on Redis) holds all the personal information, like name, phone number, medical record number, and consent, behind a meaningless ID called a patient_token. Before anything reaches the AI model, an identity minimizer swaps every personal detail for a [PLACEHOLDER]. The result is that the model never sees who the patient is. Only a separate, authorized service can turn a token back into a real name, and it only does that to fill in the dashboard or to dial the phone.

Clinical reasoning with Anthropic Claude. Claude Sonnet 4.6 turns the de-identified discharge text into a friendly follow-up plan written at an 8th-grade reading level. Claude Haiku 4.5 is the fast "thinking" model inside the live voice agent, where every millisecond of delay is something the patient can hear.

Keeping the rawest data on-device. pdfplumber reads the text out of the discharge PDF, and a local model running through Ollama (llama3.2) organizes it. This way, the most sensitive health information never has to leave the machine.

Voice. The Deepgram Voice Agent runs the live call, using nova-3 for speech-to-text and aura-2 for the voice (thalia in English, nestor in Spanish). It runs over a WebSocket and connects to a real phone line through Twilio Media Streams. Both the outbound voice call and the SMS thread work end to end.

Risk you can explain. A fixed keyword engine handles all the escalation, not an AI model, so we can audit it line by line. Next to it sits a diagnosis-based lookup that maps the patient's main diagnosis to an ICD-10 group and pulls a 30-day readmission baseline for that group. Those baselines are calibrated to published MIMIC-IV (v3.1) cohort statistics. The lookup is wired into the live risk pipeline, so every risk score pulls its baseline as the request happens and feeds the readmission and complication scores. Every alert can be explained, instead of coming out of a black box.

Supporting infrastructure. Redis holds the clinical state and the RAG cache. SQLite stores accounts and which doctor is assigned to which patient. APScheduler fires off the recurring check-ins. A prompt compressor cuts the token cost while carefully avoiding any personal field. SMS and outbound calls run through Twilio, hardened with API-key auth, safe-number routing, and a per-patient cooldown.

English and Spanish, end to end. The prompts, the danger-phrase lists, and the voices all support both languages, and the language is treated as a setting rather than something baked in. Adding a new language is a config change, not a rewrite.

Challenges we ran into

A live outbound phone call is the riskiest thing you can possibly demo. Wiring Twilio Media Streams, Deepgram, and the text-to-speech into a real-time loop that reliably hangs up when the check-in is finished took a lot of hardening. Getting it to work on demand, not just in rehearsal, was the hardest engineering of the weekend.
Keeping the AI blind to personal information while still making the dashboard useful forced a real architecture decision. The vault, the minimizer, and the re-link boundary had to be designed in.
Resisting the urge to be helpful. The hardest discipline was making the danger path fixed and refusing to let the model judge how serious a symptom was.
Making MIMIC-IV portable. We committed the cohort baselines directly so the backend imports cleanly on a fresh clone, instead of depending on a multi-gigabyte dataset download.

Accomplishments that we're proud of

A working slice that runs end to end on both channels: upload a discharge summary, enroll the patient, run a live voice check-in and an SMS thread, classify the answers, watch the flags fill the dashboard, and trip a danger escalation on cue.
A safety design where most of what makes it safe is what it refuses to do. No diagnosis, no dosing changes, no guessing at languages mid-call, and no writing to the medical record.
Privacy: The model never sees the patient's identity, and the most sensitive data is parsed on-device.
Multilingual, built for the exact population that usually gets left out.
Risk alerts grounded in real population baselines, calibrated to MIMIC-IV and explained rather than hidden in a black box.

What we learned

In healthcare AI, drawing a hard line is what keeps the system out of dangerous, regulated territory.
Erring toward a human is the right default. A keyword check will catch some false alarms and miss some paraphrases, this is better than missing a dangerous warning.
Speech-to-text gets inaccurate in noisy situations. That's the whole reason a human reads every flag.

What's next for CareCall

FHIR and EHR integration (Epic, Cerner) to replace the upload-and-parse step. Our format is already shaped like FHIR to make the swap clean.
Red-flag libraries written by clinicians, maintained by medical staff for each language instead of hard-coded by engineers.
More languages, starting with Hindi. The pipeline is built to handle any number of locales.
WhatsApp and other channels as drop-in adapters on the same conversation core.
A richer MIMIC-grounded risk model that moves from diagnosis-group baselines toward per-patient details like other conditions, complex medication flags, and social factors we already capture.
Production-grade compliance: HIPAA-ready encryption and audit logs, business agreements with our phone, speech, and AI vendors, and consent and opt-out handling built into enrollment.

About Us

We're a four-person team:

Anuj built the UI and frontend, plus the on-device PDF ingestion pipeline.
Rey built the voice pipeline and the Deepgram integration, the real-time call loop.
Kush built the personalized care-plan generation and the readmission and risk analysis backend.
Reet built the authentication and the patient-doctor-careteam dashboards.