Inspiration

The Institute of Medicine estimates that medication errors kill roughly 7,000 Americans every year, and injure over a million more. The people most at risk are elderly adults managing five or more prescriptions, often reading tiny label text in poor lighting without their glasses nearby. A caregiver hundreds of miles away has no visibility into whether their parent took their morning pills. A non-English speaker can't understand what "contraindicated with NSAIDs" means. We wanted to build the tool that lives between the medicine cabinet and the next doctor's appointment.

What it does

MedSNAP is a voice-first medication assistant for households. Point your phone at any pill bottle and it translates the label into plain language: what it does, how to take it, what to watch out for. It checks every new medication against everything else in your household cabinet and warns you about dangerous combinations in real time. Pour loose pills into your palm and it identifies them by imprint code against the NLM's drug database. Ask questions by voice and it answers aloud through ElevenLabs: household questions like "did Grandpa take his morning pills?" stay on Gemma 4, while medical knowledge questions go to Gemini. The home screen shows today's dose schedule with one-tap logging, so caregivers always know where things stand.
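
Under the hood, that split is decided per query before any model is called. A minimal sketch of the routing decision in TypeScript; the keyword heuristic and the `classifyQuery` name are ours for illustration, and the real classifier is more involved:

```typescript
type Route = "gemma" | "gemini";

// Words that signal a question about this household's people and dose logs.
const HOUSEHOLD_HINTS = ["took", "dose", "schedule", "pills", "grandpa", "mom"];

function classifyQuery(query: string): Route {
  const text = query.toLowerCase();
  // Household/adherence questions stay on Gemma; anything else is
  // treated as general medical knowledge and routed to Gemini.
  return HOUSEHOLD_HINTS.some((hint) => text.includes(hint))
    ? "gemma"
    : "gemini";
}

classifyQuery("Did Grandpa take his morning pills?");    // "gemma"
classifyQuery("What are the side effects of lisinopril?"); // "gemini"
```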

How we built it

We built MedSNAP as a Next.js PWA so it works on any phone without an app store. The camera pipeline uses getUserMedia and sends frames to Gemini 2.5 Flash for multimodal label extraction and pill imprint reading. Every scan is cross-checked against three independent drug databases: OpenFDA for label verification, DailyMed for structured OTC drug information, and the NLM RxImage API for pill identification by imprint (the same database that GoodRx and WebMD use). Voice queries are routed between Gemma 4 and Gemini based on query classification, with ElevenLabs delivering the spoken response. The household medication graph lives in MongoDB Atlas: members, medications, dose logs, and interactions are all stored as documents. We seeded the demo with 35 days of realistic dose history across a four-member family to demonstrate the adherence tracking.
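
The first step of that pipeline is plain web platform code. A sketch of the capture path, assuming an illustrative `#scanner` video element and `/api/scan` route (neither name is from the actual codebase):

```typescript
// Open the rear camera, grab one frame, and ship it to the server as a
// JPEG data URL for multimodal extraction.
async function startScan(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: "environment" }, // prefer the back camera on phones
  });
  const video = document.querySelector<HTMLVideoElement>("#scanner")!;
  video.srcObject = stream;
  await video.play();

  // Draw the current frame onto an offscreen canvas and encode it.
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext("2d")!.drawImage(video, 0, 0);
  const frame = canvas.toDataURL("image/jpeg", 0.9);

  // The server route forwards the image to Gemini 2.5 Flash for extraction.
  await fetch("/api/scan", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ frame }),
  });
}
```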

Challenges we ran into

Getting Gemini to reliably extract structured data from poorly lit, partially visible bottle labels was harder than expected: the model would confidently return nulls for readable fields. We solved this with schema coercion and OpenFDA as a fallback verifier. Gemma 4 being a reasoning model caused a different problem: it added labels like "Final Answer:", "Draft:", and "Sentence:" before responses, and repeated itself. We wrote a post-processing layer that strips any invented prefix and deduplicates sentences (sketched below). Browser autoplay policy was a third obstacle: audio playback was silently blocked when triggered inside async callbacks far from the original user gesture. We resolved this by playing a silent AudioContext buffer on the first mic tap to unlock audio for the session.
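
The post-processing layer is small. A simplified sketch; the shipped version matches a wider range of invented prefixes than the three shown here:

```typescript
// Strip a reasoning-style prefix and drop repeated sentences from
// Gemma's raw output before it is spoken or displayed.
function cleanGemmaResponse(raw: string): string {
  const withoutPrefix = raw.replace(
    /^\s*(Final Answer|Draft|Sentence)\s*:\s*/i,
    ""
  );
  const seen = new Set<string>();
  return withoutPrefix
    .split(/(?<=[.!?])\s+/) // naive sentence boundary split
    .filter((sentence) => {
      const key = sentence.trim().toLowerCase();
      if (key === "" || seen.has(key)) return false;
      seen.add(key);
      return true;
    })
    .join(" ");
}
```

The autoplay fix is equally compact. A sketch of the unlock trick, with `#mic` standing in for the real mic button's selector:

```typescript
// Play one silent sample inside the first tap's event handler. The
// browser then treats the page as having produced audio from a user
// gesture, so later ElevenLabs playback from async code succeeds.
let ctx: AudioContext | null = null;

function unlockAudio(): void {
  if (ctx) return;
  ctx = new AudioContext();
  const silent = ctx.createBuffer(1, 1, 22050); // one frame of silence
  const source = ctx.createBufferSource();
  source.buffer = silent;
  source.connect(ctx.destination);
  source.start();
}

document
  .querySelector<HTMLButtonElement>("#mic")!
  .addEventListener("click", unlockAudio, { once: true });
```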

Accomplishments that we're proud of

We built a working pill identification pipeline that goes from a photo to a confirmed drug name via NLM RxImage, something most "AI medication apps" skip entirely in favor of text search. The dual AI routing between Gemma 4 and Gemini is real and visible to the user with a labeled badge on every answer. The cabinet visualization, today's dose tracker, and member color-coding came together into something that actually looks like it was designed for the user it serves, not just for a hackathon demo.
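
For reference, the imprint lookup behind that pipeline is a single GET against the RxImage REST endpoint. A trimmed sketch; the `nlmRxImages`, `name`, and `rxcui` response fields follow the NLM documentation, and error handling is omitted:

```typescript
interface RxImageCandidate {
  name: string;  // drug name, e.g. "Acetaminophen 500 MG Oral Tablet"
  rxcui: number; // RxNorm concept identifier
}

// Query the NLM RxImage API by imprint code, optionally narrowed by
// color, and return candidate matches.
async function identifyByImprint(
  imprint: string,
  color?: string
): Promise<RxImageCandidate[]> {
  const params = new URLSearchParams({ imprint });
  if (color) params.set("color", color);
  const res = await fetch(
    `https://rximage.nlm.nih.gov/api/rximage/1/rxnav?${params}`
  );
  if (!res.ok) return [];
  const data = await res.json();
  return (data.nlmRxImages ?? []).map(
    (img: { name: string; rxcui: number }) => ({
      name: img.name,
      rxcui: img.rxcui,
    })
  );
}
```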

What we learned

The hardest part of building a medical tool isn't the AI; it's knowing when not to trust it. We learned to treat every AI output as a draft that needs a second source: Gemini reads the label, and OpenFDA verifies the dosage; Gemini sees a pill, and RxImage confirms the identity. That verification layer is what makes the difference between a demo and something a real person could actually use. We also learned that accessibility isn't a feature you add at the end: designing for an elderly user with reduced vision and motor control shaped every decision, from font size to touch target size to the choice to make voice the primary interface.
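
In code, that lesson is a gate: nothing enters the cabinet until a second source agrees. A sketch of the label check against the public OpenFDA drug label endpoint; the `LabelExtraction` shape is ours, and the substring comparison stands in for the fuzzier matching a real implementation needs:

```typescript
interface LabelExtraction {
  drugName: string;   // as read by Gemini from the bottle
  dosageText: string; // e.g. "take one tablet by mouth daily"
}

// Accept Gemini's reading only if the official FDA label for the same
// drug contains compatible dosage language; otherwise flag for review.
async function verifyAgainstOpenFDA(ex: LabelExtraction): Promise<boolean> {
  const search = `openfda.brand_name:"${ex.drugName}"`;
  const url =
    "https://api.fda.gov/drug/label.json?search=" +
    encodeURIComponent(search) +
    "&limit=1";
  const res = await fetch(url);
  if (!res.ok) return false; // no label found: do not auto-accept
  const data = await res.json();
  const sections: string[] =
    data.results?.[0]?.dosage_and_administration ?? [];
  return sections.some((s) =>
    s.toLowerCase().includes(ex.dosageText.toLowerCase())
  );
}
```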

What's next for MedSNAP

Next up is refill support: when a prescription runs low, MedSNAP will let users notify their doctor to request a refill directly from the app. We also plan to broaden the medication database to cover more drugs and to add more interface languages, so the plain-language explanations reach households that don't read English.

Built With

Next.js, MongoDB Atlas, Gemini 2.5 Flash, Gemma 4, ElevenLabs, OpenFDA, DailyMed, NLM RxImage