Inspiration
A few months ago I sat at the kitchen table with my mom, a stack of her medical and insurance bills spread out between us. We were trying to make sense of it: line items with codes neither of us recognized, charges that didn't match what she remembered happening, an Explanation of Benefits that might as well have been written in another language. We spent an evening on hold, getting transferred, repeating her account number, trying to ask the right questions without knowing what the right questions even were.
What stuck with me wasn't that the information was hidden — by law, most of it is available. It's that it was unreadable, and the people who most need to push back on a bill are usually the least equipped to. My mom is sharp and capable, and the system still made her feel powerless. That's the gap I wanted to close: not more data, but the ability to actually act on it.
What it does
Veto lets you talk to your medical bills. You upload a bill or your discharge paperwork, and Veto reads it, audits every charge against fair-price benchmarks, and flags likely overcharges and duplicate charges. Then you just talk to it: "Why is this so expensive?" "Did I get charged twice?" It walks you through the line items and asks whether you actually received each one — and anything you don't recognize gets folded into an appeal letter it drafts for you, in plain language, in your language. For discharge papers, it explains your instructions and builds a simple medication schedule. Everything is voice-first, because the people who need this most shouldn't have to fight a dashboard too.
How I built it
Veto runs on Next.js (deployed on Vercel) with all AI on xAI's Grok: Grok vision parses the document into structured line items, the Grok Voice Agent API (realtime speech-to-speech) is the primary interface, and Grok chat generates the appeal letters and care summaries.
The core design decision: the AI never invents a number. A deterministic layer — not the model — benchmarks each charge against a fixed reference anchored to Medicare rates, and flags anything that exceeds a multiple of the allowed amount, the same heuristic professional billing advocates use:
$$\text{flag if } p_{\text{billed}} > k \cdot r_{\text{medicare}}, \quad k \in [3,4]$$
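As a rough illustration of the rule above, here is a minimal TypeScript sketch. The `LineItem` and `AuditResult` shapes, the `auditCharge` name, the lookup-table format, and the `DEMO-1` code with its rate are all assumptions made for this example, not the project's actual code or real Medicare figures.

```typescript
// Hypothetical shape of a parsed line item; field names are assumptions.
interface LineItem {
  code: string;        // billing code as printed on the bill
  description: string;
  billed: number;      // billed amount in dollars
}

interface AuditResult {
  code: string;
  billed: number;
  reference: number | null; // null when no benchmark exists for the code
  ratio: number | null;     // billed / reference, when a benchmark exists
  flagged: boolean;
}

// Flags a charge when billed > k * reference, with k assumed to lie in [3, 4].
function auditCharge(
  item: LineItem,
  referenceRates: Record<string, number | undefined>,
  k: number = 3
): AuditResult {
  const reference = referenceRates[item.code];
  if (reference === undefined || reference <= 0) {
    // No usable benchmark: report the charge unflagged instead of guessing a price.
    return { code: item.code, billed: item.billed, reference: null, ratio: null, flagged: false };
  }
  return {
    code: item.code,
    billed: item.billed,
    reference,
    ratio: item.billed / reference,
    flagged: item.billed > k * reference,
  };
}

// Placeholder data only: "DEMO-1" and its rate are invented for the example.
const demoRates: Record<string, number | undefined> = { "DEMO-1": 100 };
const demoResult = auditCharge(
  { code: "DEMO-1", description: "Example service", billed: 350 },
  demoRates
);
// demoResult.flagged is true because 350 > 3 * 100.
```

Returning an unflagged result with a `null` reference when a code has no benchmark keeps the layer from producing a number it cannot justify.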
I also added a bounded out-of-pocket estimate so a patient can see their realistic share of a bill. It is computed deterministically from the charge $C$ and three plan inputs: remaining deductible $D$, coinsurance rate $c$, and remaining out-of-pocket maximum $M$:
$$\text{share} = \min\!\Big( \min(C, D) + c \cdot \max(C - D,\, 0),\; M \Big)$$
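The formula translates almost term for term into code. The sketch below is one possible TypeScript rendering; the function and parameter names are assumptions, and $C$ is taken to be the charge amount in dollars.

```typescript
// share = min( min(C, D) + c * max(C - D, 0), M )
// Names are illustrative, not taken from the project's code.
function estimatePatientShare(
  charge: number,              // C: the charge amount
  remainingDeductible: number, // D
  coinsurance: number,         // c: fraction between 0 and 1
  remainingOopMax: number      // M
): number {
  if (charge < 0 || remainingDeductible < 0 || remainingOopMax < 0) {
    throw new Error("Amounts must be non-negative");
  }
  if (coinsurance < 0 || coinsurance > 1) {
    throw new Error("Coinsurance must be a fraction between 0 and 1");
  }
  // The patient pays the charge in full until the deductible is met...
  const deductiblePortion = Math.min(charge, remainingDeductible);
  // ...then the coinsurance fraction of whatever exceeds the deductible...
  const coinsurancePortion = coinsurance * Math.max(charge - remainingDeductible, 0);
  // ...and never more than the remaining out-of-pocket maximum.
  return Math.min(deductiblePortion + coinsurancePortion, remainingOopMax);
}

// Worked example with made-up inputs: C = 2000, D = 500, c = 0.2, M = 3000.
// min(2000, 500) + 0.2 * max(1500, 0) = 500 + 300 = 800, which is below M.
const exampleShare = estimatePatientShare(2000, 500, 0.2, 3000);
```

With the same charge but only 600 left before the out-of-pocket maximum, the cap applies and the estimate is 600.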
The voice agent only ever speaks numbers the deterministic layer produced, so the conversation feels natural but stays grounded in auditable math.
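One way to check this kind of grounding is an output-side guard that compares every dollar amount in a reply against the set the deterministic layer produced. The sketch below is an assumption about how such a check could look, not a description of Veto's implementation: it presumes a text transcript of the agent's reply is available, and both function names are hypothetical.

```typescript
// Extracts dollar amounts such as "$1,234.50" from text and returns them in cents.
function extractDollarAmountsInCents(text: string): number[] {
  const matches = text.match(/\$\d[\d,]*(?:\.\d{1,2})?/g) || [];
  return matches.map((m) => Math.round(parseFloat(m.replace(/[$,]/g, "")) * 100));
}

// Returns every amount in the reply that the deterministic layer did not produce.
// An empty array means all spoken dollar amounts are grounded.
function findUngroundedAmounts(reply: string, allowedCents: number[]): number[] {
  return extractDollarAmountsInCents(reply).filter(
    (cents) => allowedCents.indexOf(cents) === -1
  );
}

// Made-up example: the deterministic layer produced $1,234.50 and $400.
const allowedAmounts = [123450, 40000];
const groundedReply = "That scan was billed at $1,234.50, and the benchmark is $400.";
const strayReply = "A fair price would be about $999.";

const groundedCheck = findUngroundedAmounts(groundedReply, allowedAmounts); // []
const strayCheck = findUngroundedAmounts(strayReply, allowedAmounts);       // [99900]
```

This only catches amounts written with a dollar sign; figures spelled out in words would need separate handling.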
Challenges I ran into
- Keeping the AI honest. A hallucinated price in a billing app is worse than no app at all. I solved it by moving every number out of the model and into deterministic code, with strict prompt rules backing it up.
- Realtime voice. Streaming microphone audio over a WebSocket with low latency and clean turn-taking is hard; I built a text-chat mode on the same agent as a reliable fallback for noisy rooms.
- Drawing an honest line on "phantom" charges. The app can't know what happened in the exam room — only the patient does. So instead of accusing, it asks "do you remember getting this?" and frames disputes as the patient's own, never as a claim the app makes.
What I learned
- Voice changes who technology is for. My mom would never open a billing dashboard — but she would absolutely talk to one. That reframed the whole project.
- In health AI, the hard part isn't capability, it's trust and guardrails. The restraint (refusing to guess) is the feature.
- Scope is a strategy. One flawless flow beats five half-built ones, so I shipped the bill experience end to end before adding anything else.
What's next
Estimating what a visit or test will cost before you go, a native mobile app, and pointing the same voice-grounding engine at other paperwork patients can't read — lab results and, eventually, genomic reports. The goal stays the same: turning confusion into agency, one conversation at a time.
Built With
- grok
- grok-vision
- grok-voice
- javascript
- next.js
- node.js
- react
- swift
- tailwind
- typescript
- vercel
- websocket
- webspeech
- xai