FrontDesk

Inspiration

We kept coming back to one statistic: the average US primary care physician spends 27% of their workday on documentation and administrative tasks, more than patient contact time. Meanwhile, virtual care platforms mostly digitized the waiting room, not the work. We asked: what if the front desk were an AI that never clocks out, never puts you on hold, and hands the clinician a brief so tight they can clear a queue of cases in a lunch break?

What It Does

FrontDesk is a virtual care platform with three components working in concert:

Voice & Web Intake. Patients describe their situation by filling in a web form or by speaking to an ElevenLabs conversational AI agent. Both paths feed the same pipeline.

AI Triage Engine. The moment a case is submitted, a rules-first + LLM pipeline runs: deterministic safety rules screen for emergency keywords (chest pain, stroke, suicidal ideation) and high-risk histories; Gemini 2.5 Flash then generates a structured clinical brief — patient summary, symptom timeline, relevant history, confidence score, and risk flags. Every case is sorted into one of three lanes:

Auto-resolvable — straightforward, AI-drafted response; clinician reviews and approves in one click.
Batch block — needs a brief 2–5 minute 1:1; patients book into the next scheduled clinic window; provider runs Jitsi breakout rooms in rapid succession.
Escalate — immediate urgent care redirect.
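
A minimal sketch of how the rules-first lane assignment could look. The type names, the keyword list, the 0.85 threshold, and the choice to send high-risk histories to a batch block are all assumptions made for illustration; the write-up does not specify them. The point is that the lane comes from a pure function and carries the flag that produced it.

```typescript
// Hypothetical types and thresholds, for illustration only.
type Lane = "auto_resolvable" | "batch_block" | "escalate";

interface TriageInput {
  complaint: string;        // free-text description from intake
  highRiskHistory: boolean; // set by a separate deterministic history rule
  confidence: number;       // 0..1 score attached to the clinical brief
}

interface TriageResult {
  lane: Lane;
  reason: string; // the specific flag that decided the lane
}

const EMERGENCY_KEYWORDS = ["chest pain", "stroke", "suicidal"];
const AUTO_RESOLVE_MIN_CONFIDENCE = 0.85; // assumed value

// Pure function: same input always yields the same lane and reason.
function assignLane(input: TriageInput): TriageResult {
  const text = input.complaint.toLowerCase();
  const hit = EMERGENCY_KEYWORDS.find((k) => text.includes(k));
  if (hit !== undefined) {
    return { lane: "escalate", reason: `emergency_keyword:${hit}` };
  }
  if (input.highRiskHistory) {
    // Assumption: a high-risk history blocks auto-resolution.
    return { lane: "batch_block", reason: "high_risk_history" };
  }
  if (input.confidence < AUTO_RESOLVE_MIN_CONFIDENCE) {
    return { lane: "batch_block", reason: "low_confidence" };
  }
  return { lane: "auto_resolvable", reason: "no_flags" };
}
```

Because emergency keywords are checked first, a high confidence score can never downgrade an escalation.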

Provider Workspace. Clinicians see a focused two-pane decision screen: AI clinical packet on the left, approve/modify/escalate/reject panel on the right. No navigation overhead. A case auto-loads after each decision. Providers can also run live clinic blocks — pulling patients into Jitsi breakout rooms one by one while seeing the pre-briefed packet for each.

Every decision is recorded in an append-only audit trail. The AI never makes a regulated clinical decision — only a licensed clinician transitions a case to approved, modified, escalated, or rejected.

How We Built It

Next.js 16 (App Router) + React 19 + TypeScript for the full-stack application
MongoDB Atlas for case storage, audit events, and scheduling
Google Gemini 2.5 Flash for narrative generation and dynamic follow-up questions, called with structured JSON schema output
ElevenLabs Conversational AI for voice intake: an embedded widget and an HMAC-verified post-call webhook that feeds transcripts into the same pipeline as the web form
Jitsi for live clinic breakout rooms
Tailwind CSS v4 + shadcn/ui for the interface
Stateless JWT sessions (jose + Node built-in scrypt) for auth
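
For the HMAC-verified webhook in the stack above, a verification step might look roughly like this sketch. It uses Node's built-in crypto module. The header shape (`t=<unix seconds>,v0=<hex digest>`), the signed string (`timestamp.body`), and the five-minute tolerance are assumptions that should be checked against the provider's webhook documentation; the function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed header shape: "t=<unix seconds>,v0=<hex digest>".
function parseSignatureHeader(header: string): { timestamp: string; digest: string } | null {
  const parts = new Map<string, string>();
  for (const piece of header.split(",")) {
    const idx = piece.indexOf("=");
    if (idx > 0) parts.set(piece.slice(0, idx).trim(), piece.slice(idx + 1).trim());
  }
  const timestamp = parts.get("t");
  const digest = parts.get("v0");
  return timestamp && digest ? { timestamp, digest } : null;
}

// Assumed signed payload: "<timestamp>.<raw request body>", HMAC-SHA256, hex encoded.
function sign(secret: string, timestamp: string, rawBody: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${rawBody}`).digest("hex");
}

function verifyWebhook(
  secret: string,
  header: string,
  rawBody: string,
  nowSeconds: number,
  toleranceSeconds = 300, // assumed replay window
): boolean {
  const parsed = parseSignatureHeader(header);
  if (!parsed) return false;
  const ts = Number(parsed.timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > toleranceSeconds) return false;
  const expected = Buffer.from(sign(secret, parsed.timestamp, rawBody), "hex");
  const received = Buffer.from(parsed.digest, "hex");
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The check must run on the raw request body, before any JSON parsing, since re-serializing the payload can change the bytes that were signed.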

The AI engine is behind a swappable interface: a fully deterministic mock engine makes the entire app runnable offline for demos; flipping AI_ENGINE=gemini routes to real Gemini with automatic fallback to mock on any error.
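
One way the swappable interface with fallback could be wired, as a sketch under assumptions: `AiEngine`, `ClinicalBrief`, `withFallback`, and `selectEngine` are hypothetical names, and the real brief has more fields than shown here. Only the `AI_ENGINE=gemini` switch and the fall-back-to-mock behavior come from the description above.

```typescript
// Hypothetical shapes; the real interface is not shown in this write-up.
interface ClinicalBrief {
  summary: string;
  source: "gemini" | "mock";
}

interface AiEngine {
  generateBrief(intakeText: string): Promise<ClinicalBrief>;
}

// Deterministic mock: same input, same output, no network access.
const mockEngine: AiEngine = {
  async generateBrief(intakeText: string): Promise<ClinicalBrief> {
    return { summary: `[MOCK] ${intakeText.slice(0, 60)}`, source: "mock" };
  },
};

// Wraps a primary engine so that any thrown error routes to the fallback.
function withFallback(primary: AiEngine, fallback: AiEngine): AiEngine {
  return {
    async generateBrief(intakeText: string): Promise<ClinicalBrief> {
      try {
        return await primary.generateBrief(intakeText);
      } catch {
        return fallback.generateBrief(intakeText);
      }
    },
  };
}

// envValue would be process.env.AI_ENGINE in the app.
function selectEngine(envValue: string | undefined, gemini: AiEngine): AiEngine {
  return envValue === "gemini" ? withFallback(gemini, mockEngine) : mockEngine;
}
```

Keeping a `source` field on the brief lets the UI show when a case was served by the mock after a fallback.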

Challenges We Ran Into

Safety architecture. We spent more time on what the AI is not allowed to do than what it does. Letting an LLM decide escalation is indefensible — "the model thought it was fine" cannot be an audit response. We ended up with a hard separation: deterministic pure functions own all safety-critical decisions (emergency keyword detection, high-risk history flags, confidence thresholds); Gemini only generates narrative. That constraint was the right call but took real design discipline to enforce.

Voice → structured data. ElevenLabs returns a raw transcript, not a form. Parsing free-form speech into a structured IntakeResponse (symptoms, duration, severity, medications) reliably, without losing clinical detail and without a second AI call that would add latency, took more iterations of the parsing logic than expected.
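
To make the parsing problem concrete, here is a deliberately small sketch of deterministic transcript parsing with no extra model call. It is not the project's parser: the field names beyond those listed above, the term lists, and the regular expressions are assumptions, and real speech needs far more coverage than this.

```typescript
// Hypothetical shape; only the four field concepts come from the write-up.
interface IntakeResponse {
  symptoms: string[];
  durationDays: number | null;
  severity: number | null; // self-reported, 1..10
  medications: string[];
  rawTranscript: string;   // kept verbatim so no clinical detail is lost
}

// Illustrative vocabularies; a real system would use much larger lists.
const SYMPTOM_TERMS = ["headache", "cough", "fever", "nausea", "rash"];
const MEDICATION_TERMS = ["ibuprofen", "paracetamol", "acetaminophen", "lisinopril"];
const UNIT_DAYS: Record<string, number> = { day: 1, week: 7, month: 30 };

function parseTranscript(transcript: string): IntakeResponse {
  const text = transcript.toLowerCase();

  // First "<number> day(s)/week(s)/month(s)" phrase, normalized to days.
  const dur = text.match(/(\d+)\s*(day|week|month)s?/);
  const durationDays = dur ? Number(dur[1]) * UNIT_DAYS[dur[2]] : null;

  // "7 out of 10" or "7/10"; values outside 1..10 are discarded.
  const sev = text.match(/(\d{1,2})\s*(?:out of|\/)\s*10/);
  const sevValue = sev ? Number(sev[1]) : null;
  const severity = sevValue !== null && sevValue >= 1 && sevValue <= 10 ? sevValue : null;

  return {
    symptoms: SYMPTOM_TERMS.filter((t) => text.includes(t)),
    durationDays,
    severity,
    medications: MEDICATION_TERMS.filter((t) => text.includes(t)),
    rawTranscript: transcript,
  };
}
```

Fields the parser cannot fill stay `null` rather than being guessed, and the raw transcript travels with the structured result so the clinician can always read the original wording.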

Clinician UX for speed. Building the review queue so a provider could process a backlog without any navigation friction meant rethinking the layout entirely. The two-pane focused screen with auto-advance was the fourth design iteration; earlier versions still felt like a form.

Keeping offline mode honest. The mock engine had to generate realistic-looking but clearly artificial data so demos didn't mislead judges, while still covering all state machine paths. The seed script resets to the same deterministic state every time.

Accomplishments That We're Proud Of

The safety story is actually defensible. Every escalation decision is rule-derived and traceable to a specific flag — not a model probability. That distinction matters in a regulated domain.

Voice and web intake are the same pipeline. There is no forked code path for voice cases. The ElevenLabs transcript feeds runIntakePipeline() identically to the web form. The provider never knows how the patient submitted — and shouldn't need to.

The platform runs fully offline. No API keys required; mock engine produces valid packets across all lanes. A judge with no internet could run the full demo in under two minutes with npm run seed.

The state machine encodes the compliance boundary. The regulated transitions (in_review → approved/modified/escalated/rejected) are enforced at the database layer with a role check. It's not a UI convention — the server rejects any attempt to make a clinical decision without role: "clinician".
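
A sketch of what a server-side guard for the regulated transitions could look like, combined with the append-only audit record. The status and role values `in_review`, `approved`, `modified`, `escalated`, `rejected`, and `clinician` come from the description above; every other name here (`decide`, `AuditEvent`, the other roles, the in-memory audit array) is hypothetical.

```typescript
type Decision = "approved" | "modified" | "escalated" | "rejected";
type CaseStatus = "submitted" | "in_review" | Decision; // "submitted" is assumed
type Role = "patient" | "clinician" | "system";          // roles other than clinician are assumed

interface Actor { id: string; role: Role; }
interface CaseRecord { id: string; status: CaseStatus; }
interface AuditEvent {
  caseId: string;
  from: CaseStatus;
  to: CaseStatus;
  actorId: string;
  at: string; // ISO timestamp
}

// Applies a clinical decision or throws; never mutates the input record.
function decide(
  record: CaseRecord,
  decision: Decision,
  actor: Actor,
  audit: AuditEvent[], // stand-in for an append-only store
  now: Date = new Date(),
): CaseRecord {
  if (record.status !== "in_review") {
    throw new Error(`Illegal transition ${record.status} -> ${decision}`);
  }
  if (actor.role !== "clinician") {
    throw new Error("Only a clinician may make a clinical decision");
  }
  // Events are only ever appended, never edited or removed.
  audit.push({
    caseId: record.id,
    from: record.status,
    to: decision,
    actorId: actor.id,
    at: now.toISOString(),
  });
  return { ...record, status: decision };
}
```

Because the check lives in the function that performs the write, a client that bypasses the UI still cannot move a case into a decided state without the clinician role.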

What We Learned

Rules and LLMs have different jobs, and conflating them is dangerous. The instinct to let the model "just handle" triage is attractive — it feels more flexible. But the moment you hand safety decisions to something you can't fully audit, you've traded correctness for capability. For a domain like healthcare, the right architecture is: rules own the safety rail, LLMs own the language.

We also learned how much of the clinician's cognitive load is not clinical. Contextualizing a new patient, synthesizing their history, formulating a structured question — that's all pre-work before the clinical judgment begins. Automating that pre-work is where the real time savings are. The AI brief doesn't replace the clinician; it makes every minute of clinician time actually clinical.

What's Next for FrontDesk

Real EHR integration. Pulling existing patient history from Epic or Athena would close the loop — the AI packet would include documented allergies, prior diagnoses, and labs rather than relying only on patient self-report.

Prescription and lab ordering. Clinician decisions currently produce free-text responses. The next step is structured output: e-prescribe directly from the approved case, or trigger a lab order from the decision panel.

Clinician performance analytics. The audit trail captures every decision and the AI's prior recommendation. That data can surface disagreement rates, average review time per lane, and confidence calibration — helping health systems understand where AI suggestions are trusted and where they're routinely overridden.

Multi-specialty routing. Today the triage rules are generalist. Adding specialty-specific intake forms and routing logic (dermatology photos via upload → derm queue; cardiology history patterns → cardiology block) would let a single platform serve an entire health system rather than one practice.

Asynchronous messaging. For cases where the clinician needs more information, the current flow sends a "needs info" status and the patient re-submits. A structured back-and-forth messaging thread inside the case would be faster and maintain the full context chain.