Lighthouse
A voice that checks in, so someone always knows.
Inspiration
Half of all patients don't take their medication as prescribed, and one in five ends up back in the hospital within a month, usually from something a single phone call would have caught. The reason is simple: your doctor sees you a few times a year. The other 8,000 hours, no one is watching. That gap is widest for the people who can least afford it, like those living with dementia, bipolar disorder, or depression, where a skipped dose or a few sleepless nights can quietly turn into a crisis. We wanted to fill those 8,000 hours with something that scales: a voice that checks in, so someone always knows.
What it does
Lighthouse is an AI voice agent that calls patients at home on a schedule that fits them. It has a real, natural conversation in Grok's voice: "Did you take your pill this morning?", "How are you feeling today?" It listens for what people say and for what they don't. When a patient sounds unwell, admits they skipped a dose, shows an early warning pattern (for example mood climbing while sleep is dropping, the way a manic episode begins), or simply doesn't pick up, Lighthouse acts. It texts their caregiver, raises a live alert on the doctor's dashboard, and logs the call as a transcript, a summary, and a day rating. It is wearable-aware, so it never calls at 3am or while someone is asleep or driving. There are two sides to it: a calm patient portal with no app to learn, and a clinician dashboard where every check-in lands automatically. Every patient, every day, for pennies a call.
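As an illustration of the "mood climbing while sleep is dropping" pattern described above, here is a minimal sketch of a trend check over recent day ratings. It is not the project's actual rule: the `DailyCheckIn` shape, the 1 to 10 mood scale, and both slope thresholds are hypothetical placeholders with no clinical validation.

```typescript
// Hypothetical shape of one logged check-in; field names are illustrative.
interface DailyCheckIn {
  moodRating: number; // assumed scale: 1 (very low) to 10 (very elevated)
  sleepHours: number;
}

// Slope of the least-squares line through equally spaced daily values.
function slope(values: number[]): number {
  const n = values.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((a: number, b: number) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  values.forEach((y: number, x: number) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  return num / den;
}

// Flags mood trending up while sleep trends down over the recent days.
// Threshold defaults are placeholders, not clinically validated values.
function showsEarlyManicPattern(
  recent: DailyCheckIn[],
  moodSlopeMin: number = 0.5,
  sleepSlopeMax: number = -0.5,
): boolean {
  if (recent.length < 3) return false; // too little history to call a trend
  const moodTrend = slope(recent.map((d: DailyCheckIn) => d.moodRating));
  const sleepTrend = slope(recent.map((d: DailyCheckIn) => d.sleepHours));
  return moodTrend >= moodSlopeMin && sleepTrend <= sleepSlopeMax;
}

const recentDays: DailyCheckIn[] = [
  { moodRating: 5, sleepHours: 7 },
  { moodRating: 6, sleepHours: 6 },
  { moodRating: 8, sleepHours: 4 },
  { moodRating: 9, sleepHours: 2 },
];
console.log(showsEarlyManicPattern(recentDays)); // true
```

A fixed slope threshold like this is the kind of rule that can cry wolf on an ordinary good day, which is why the thresholds are parameters rather than constants.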
How we built it
Lighthouse is a monorepo with two halves. The dashboard and patient portal are a Next.js app (React 19, TypeScript, Tailwind, shadcn/ui). The brain is a Node bridge (Express plus ws) that owns the hard real-time work: it places outbound calls through Telnyx, receives the bidirectional media stream as g711_ulaw, and relays the audio straight to the Grok Voice Agent API over a raw WebSocket with no transcoding, since both sides already speak PCMU at 8kHz. Grok's built-in turn detection handles endpointing. Around the live call, we use the Vercel AI SDK with @ai-sdk/xai (grok-4) for the text reasoning: post-call summaries, structured mood and risk extraction with generateObject and Zod, the dashboard copilot, and drafting the caregiver message. Escalations go out over SMS and Telegram, and the dashboard updates live through a WebSocket push. State lives in an in-memory store snapshotted to JSON, kept behind a small interface so it can become a real database later. Because the bridge needs persistent WebSocket servers, we couldn't deploy it serverless, so we run the whole stack locally and expose it through a single Cloudflare named tunnel to a real public URL.
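To make the "in-memory store behind a small interface" idea concrete, here is a hedged sketch. The `CallStore` interface, the `CallRecord` fields, and the class name are all assumptions made for illustration; the real store's shape is not shown in this write-up. Writing the snapshot string to disk is left out to keep the example self-contained.

```typescript
// Hypothetical record for one completed check-in call.
interface CallRecord {
  patientId: string;
  transcript: string;
  summary: string;
  dayRating: number;
}

// The small interface the rest of the app would depend on. A Postgres-backed
// class could implement the same methods later without touching callers.
interface CallStore {
  add(record: CallRecord): void;
  listFor(patientId: string): CallRecord[];
  snapshot(): string; // serialized state, ready to be written to a JSON file
}

class InMemoryCallStore implements CallStore {
  private records: CallRecord[] = [];

  // Rebuilds a store from a previously saved snapshot string.
  static fromSnapshot(json: string): InMemoryCallStore {
    const store = new InMemoryCallStore();
    store.records = JSON.parse(json) as CallRecord[];
    return store;
  }

  add(record: CallRecord): void {
    this.records.push({ ...record }); // copy so callers cannot mutate state
  }

  listFor(patientId: string): CallRecord[] {
    return this.records
      .filter((r: CallRecord) => r.patientId === patientId)
      .map((r: CallRecord) => ({ ...r }));
  }

  snapshot(): string {
    return JSON.stringify(this.records);
  }
}

const callStore = new InMemoryCallStore();
callStore.add({ patientId: "p1", transcript: "...", summary: "Took morning dose.", dayRating: 7 });
const restoredStore = InMemoryCallStore.fromSnapshot(callStore.snapshot());
console.log(restoredStore.listFor("p1").length); // 1
```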
Challenges we ran into
Twilio flagged and blocked our account mid-build, so we tore out the telephony layer and rebuilt it on Telnyx Call Control: new call placement, webhooks, and media-frame parsing. The first calls came back as garbled, pitched-up noise: a codec mismatch, with Grok emitting one audio format while the phone line expected G.711 μ-law at 8kHz. Pinning both ends to PCMU and forwarding frames untouched fixed the audio and removed the transcoding latency that had made the conversation feel robotic. Grok then stayed silent on the line until we matched its GA realtime event names. The Vercel AI SDK's speech APIs don't support xAI, so we couldn't let it own the call; we drive the Grok realtime WebSocket directly and use the SDK only for text reasoning. Telnyx trial accounts dial only verified numbers, so every live test ran against one phone. Vercel can't host a persistent WebSocket server at all, so we run the stack locally behind a Cloudflare named tunnel for a real public URL. The subtlest problem was judgment: tuning prompts and thresholds so an escalation fires on a true danger pattern without crying wolf over an ordinary bad day.
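The "forward frames untouched" fix can be sketched as two pure mapping functions that re-wrap the base64 audio payload without decoding it. The message shapes and event names below are assumptions modeled on common media-stream and realtime-API conventions, not confirmed details of either provider; in particular, `AUDIO_DELTA_EVENT` is a placeholder that would need to match the GA event name the voice API actually emits.

```typescript
// ASSUMED shape of a telephony media-stream message (base64 PCMU payload).
interface PhoneMediaMessage {
  event: string;
  media?: { payload: string };
}

// ASSUMED shape of a realtime voice API event carrying audio.
interface RealtimeEvent {
  type: string;
  delta?: string; // base64 audio chunk from the model
}

// Placeholder event names; verify against the provider's current documentation.
const AUDIO_APPEND_EVENT = "input_audio_buffer.append";
const AUDIO_DELTA_EVENT = "response.output_audio.delta";

// Phone -> model: re-wrap the payload as-is. No decode, no resample.
function phoneToRealtime(raw: string): string | null {
  const msg = JSON.parse(raw) as PhoneMediaMessage;
  if (msg.event !== "media" || !msg.media) return null; // ignore start/stop etc.
  return JSON.stringify({ type: AUDIO_APPEND_EVENT, audio: msg.media.payload });
}

// Model -> phone: same idea in the other direction.
function realtimeToPhone(raw: string): string | null {
  const evt = JSON.parse(raw) as RealtimeEvent;
  if (evt.type !== AUDIO_DELTA_EVENT || !evt.delta) return null;
  return JSON.stringify({ event: "media", media: { payload: evt.delta } });
}

const inboundFrame = JSON.stringify({ event: "media", media: { payload: "f39/fw==" } });
console.log(phoneToRealtime(inboundFrame));
```

In a bridge like the one described, each function would sit inside the `message` handler of one WebSocket and write its non-null result to the other socket. Because both sides are pinned to PCMU at 8kHz, the payload string is never touched.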
Accomplishments that we're proud of
We have a real phone that actually rings, a real conversation in a natural voice, and a real escalation, all live on a public URL, not a mockup. In the live demo, the AI catches danger that sounds like good news: a patient says he feels amazing and skipped his lithium, and Lighthouse hears that, plus no sleep, plus feeling invincible, recognizes the early manic pattern, and notifies his sister and his doctor within seconds. We built two fully synchronized views that update live during a call, with no transcoding step in the audio path, and we made the whole thing demoable end to end in under three minutes.
What we learned
Most of the work was making independent services agree. Getting Telnyx and the Grok Voice Agent API to work together meant holding audio at PCMU 8kHz on both ends so nothing transcodes, reconciling two different event models, and forwarding media frames in real time in both directions. We learned to split work by what each tool is actually good at: Grok's realtime socket owns the live call (audio, turn detection, tool calls), and the Vercel AI SDK owns the text reasoning afterward; trying to make either do both was the wrong path. Tool calls turned out to be the real glue between a spoken conversation and the app: log_mood, confirm_reminder, and trigger_escalation are how a voice on the phone ends up writing to the store, lighting up the dashboard, and firing a Telegram message.
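The tool-call glue can be sketched as a small dispatcher that maps a tool name to a handler that mutates app state. The three tool names come from the text above; everything else (the `ToolCall` shape, the argument fields `rating`, `reminderId`, and `reason`, and the `AppState` structure) is hypothetical and stands in for the real store, dashboard push, and messaging calls.

```typescript
type ToolName = "log_mood" | "confirm_reminder" | "trigger_escalation";

// ASSUMED shape: the voice API hands over a tool name plus JSON-encoded arguments.
interface ToolCall {
  name: string;
  arguments: string;
}

// Stand-in for the store, dashboard push, and caregiver notification.
interface AppState {
  moods: number[];
  confirmedReminders: string[];
  escalations: string[];
}

type Handler = (state: AppState, args: Record<string, unknown>) => void;

// Argument field names below are illustrative, not the project's schema.
const toolHandlers: Record<ToolName, Handler> = {
  log_mood: (state, args) => {
    state.moods.push(Number(args.rating));
  },
  confirm_reminder: (state, args) => {
    state.confirmedReminders.push(String(args.reminderId));
  },
  trigger_escalation: (state, args) => {
    state.escalations.push(String(args.reason));
  },
};

// Returns true if the call matched a known tool and was applied.
function dispatchToolCall(state: AppState, call: ToolCall): boolean {
  if (!Object.prototype.hasOwnProperty.call(toolHandlers, call.name)) return false;
  const args = JSON.parse(call.arguments) as Record<string, unknown>;
  toolHandlers[call.name as ToolName](state, args);
  return true;
}

const appState: AppState = { moods: [], confirmedReminders: [], escalations: [] };
dispatchToolCall(appState, { name: "log_mood", arguments: '{"rating": 9}' });
dispatchToolCall(appState, {
  name: "trigger_escalation",
  arguments: '{"reason": "skipped lithium, no sleep"}',
});
console.log(appState.moods, appState.escalations);
```

Rejecting unknown tool names instead of throwing keeps a single malformed call from dropping the live conversation.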
What's next for Lighthouse
Harden the foundation: move the JSON store to Postgres behind the existing interface, add real auth and audit logging, and clear HIPAA/SOC 2 and per-call consent so Lighthouse can handle real patient data. Then make the risk engine the product: per-patient baselines instead of fixed thresholds, longitudinal models over mood, sleep, and adherence that flag a relapse days before it surfaces, and confidence-scored escalations clinicians can tune. Fuse live wearable streams (Apple Health, Fitbit, RingConn) for smarter call timing and physiological signals, and close the clinical loop with FHIR/EHR write-back so summaries land directly in the chart, plus a two-way caregiver app. Push latency lower with regional media and streaming endpointing, and add multi-language and dysarthria-robust speech for the patients who need it most. The endgame is reimbursable, outcome-backed care: bill under existing Remote Patient Monitoring and Chronic Care Management codes, run a readmission-reduction study, and operate a fleet of concurrent agents with the observability to show every escalation was the right one.
Built With
- grok
- next.js
- react
- vercel-ai-sdk
