What inspired us

We wanted grocery shopping to feel less like guessing and more like having a coach in the aisle. Too many people struggle to match products to their diet (halal, vegan, low-FODMAP), health (goals, conditions, deficiencies), and budget - all while staring at a wall of options. We asked: what if your phone (or glasses) could see the shelf, know your profile, and tell you not just what something is, but what it means for you? That led to ShelfTech: an AR-style grocery assistant that runs on a camera feed, detects products, and shows personalized health and action guidance - pairings, timing, macros when they matter - instead of generic label copy.

What we learned

One size doesn’t fit all. Early versions said things like “1g protein supports muscle” for an orange. We learned to treat product type first (fruit vs supplement vs protein source) and only mention macros when they’re meaningful (e.g. no “good for your bulk” on a multivitamin). That made the left-hand “Health & Action” panel actually useful.

Voice in the browser is fragile. We hit “nothing happens” when using the Web Speech API: no feedback, silent errors, and empty transcripts. We added clear error messages (mic blocked, no speech, network) and only send final results to the API so answers stay reliable.

Deploy ≠ local. Vercel didn’t have vite in PATH and was overriding our build. We switched to npx for the build command and added a root vercel.json so the app builds correctly whether the project root is the repo or the web folder.

How we built it

We built a React + Vite + TypeScript web app that:

- Ingests a video source - phone camera, webcam, or uploaded Ray-Ban Meta glasses video - and runs object detection (Gemini, Dedalus, or GrocerEye) to draw boxes and labels on products (overlay sketch after this section).
- Stores a shopper profile (diet, body goals, conditions, deficiencies, budget) and uses it in a small RAG-style summary so every API call is profile-aware (profile sketch below).
- Overlays relevance tags on each product (e.g. “Good for cut”, “Gentle on stomach”) from an enrichment API that takes profile + product labels.
- Shows a left “Health & Action” panel with bullets tailored to the product and person: benefits, pairings, when to eat it, and macros only when they’re significant, with thresholds so we don’t overstate tiny amounts (threshold sketch below).
- Shows a right detail panel with nutrition, dietary info, and a voice Q&A (“Ask about this product”) using the Web Speech API and Gemini (or Dedalus) for answers (voice sketch below).

The stack is client-heavy: detection, overlay, and voice all call cloud APIs (Gemini/Dedalus/GrocerEye). We don’t run local models, so the same app runs on desktop and phone without losing quality.

For Ray-Ban Meta, we added an upload flow: record with the glasses, upload the video, and run the same pipeline (boxes, overlays, health panel, details) on the recording so judges can see the “glasses view” even without a live glasses feed.
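To make the overlay step concrete, here is a minimal sketch, assuming the detection API returns labels with normalized [0, 1] bounding boxes. The `Detection` shape and `drawOverlay` helper are illustrative names, not our exact implementation.

```typescript
// A minimal sketch of the overlay step, assuming normalized [0, 1] boxes.
interface Detection {
  label: string;                                                  // e.g. "Greek yogurt"
  box: { x: number; y: number; width: number; height: number };  // normalized [0, 1]
}

function drawOverlay(
  canvas: HTMLCanvasElement,
  video: HTMLVideoElement,
  detections: Detection[],
): void {
  // Match the canvas to the current frame size so boxes line up with the video.
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d");
  if (!ctx) return;

  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.lineWidth = 3;
  ctx.strokeStyle = "#22c55e";
  ctx.fillStyle = "#22c55e";
  ctx.font = "16px sans-serif";

  for (const d of detections) {
    // Scale normalized coordinates to pixel space before drawing.
    const x = d.box.x * canvas.width;
    const y = d.box.y * canvas.height;
    const w = d.box.width * canvas.width;
    const h = d.box.height * canvas.height;
    ctx.strokeRect(x, y, w, h);
    ctx.fillText(d.label, x + 4, Math.max(y - 6, 14));
  }
}
```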
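The "RAG-style summary" is really just the shopper profile compressed into a short text block that gets prepended to every detection, enrichment, and Q&A prompt. A rough sketch, with illustrative field names:

```typescript
// A rough sketch of the profile-aware summary; field names are illustrative.
interface ShopperProfile {
  diet: string[];          // e.g. ["halal", "low-FODMAP"]
  goals: string[];         // e.g. ["cut"]
  conditions: string[];    // e.g. ["IBS"]
  deficiencies: string[];  // e.g. ["iron"]
  weeklyBudget?: number;
}

// Compress the profile into a short text block that is prepended to every
// detection / enrichment / Q&A prompt, so answers are always profile-aware.
function buildProfileSummary(p: ShopperProfile): string {
  return [
    p.diet.length ? `Diet: ${p.diet.join(", ")}` : "",
    p.goals.length ? `Goals: ${p.goals.join(", ")}` : "",
    p.conditions.length ? `Conditions: ${p.conditions.join(", ")}` : "",
    p.deficiencies.length ? `Deficiencies: ${p.deficiencies.join(", ")}` : "",
    p.weeklyBudget ? `Weekly budget: about $${p.weeklyBudget}` : "",
  ]
    .filter(Boolean)
    .join("\n");
}
```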
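The "macros only when they matter" rule for the Health & Action panel boils down to category-first checks plus simple thresholds. The categories and cutoffs below are illustrative, not our exact numbers:

```typescript
// A simplified version of the "category first, macros only when meaningful" rule.
type ProductCategory = "fruit" | "vegetable" | "dairy" | "protein" | "supplement" | "other";

interface MacrosPerServing {
  proteinGrams: number;
  calories: number;
}

function macroBullets(category: ProductCategory, m: MacrosPerServing): string[] {
  // Supplements never get macro copy: "1g protein supports muscle" on a
  // multivitamin is exactly the line we want to avoid.
  if (category === "supplement") return [];

  const bullets: string[] = [];
  // Only surface macros that are actually significant for the serving.
  if (m.proteinGrams >= 8) bullets.push(`~${m.proteinGrams}g protein per serving`);
  if (m.calories >= 100) bullets.push(`~${m.calories} kcal per serving`);
  return bullets; // trivial amounts (1g protein, 15 cal) produce no bullets
}
```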
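And the voice Q&A wiring, sketched under the approach described above: show a "Listening…" state, surface the Web Speech API error codes we kept hitting (not-allowed, no-speech, network), and only forward final transcripts. `askModel` and the status copy are placeholders.

```typescript
// A minimal sketch of the voice Q&A wiring: no silent failures, final transcripts only.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

function startVoiceQuestion(
  onStatus: (message: string) => void,
  askModel: (question: string) => void,
): void {
  if (!SpeechRecognitionImpl) {
    onStatus("Voice input isn't supported in this browser.");
    return;
  }

  const rec = new SpeechRecognitionImpl();
  rec.lang = "en-US";
  rec.interimResults = true; // interim text can be shown, but it is never sent

  rec.onstart = () => onStatus("Listening…");

  rec.onerror = (e: any) => {
    // The three error codes we kept hitting during testing.
    if (e.error === "not-allowed") onStatus("Microphone access is blocked.");
    else if (e.error === "no-speech") onStatus("Didn't catch that. Please try again.");
    else if (e.error === "network") onStatus("Network error during recognition.");
    else onStatus(`Voice error: ${e.error}`);
  };

  rec.onresult = (e: any) => {
    const latest = e.results[e.results.length - 1];
    // Only final results go to Gemini/Dedalus, so we never answer half-phrases.
    if (latest.isFinal) askModel(latest[0].transcript.trim());
  };

  rec.start();
}
```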
Challenges we faced

Making health copy honest and specific. We had to add product categories (supplement, fruit, dairy, etc.) and only then layer on macros, pairings, and timing. We also had to skip macro bullets for supplements and for trivial amounts (e.g. 1g protein, 15 cal) so the app doesn’t sound like a generic “everything is good for your bulk” bot.

Voice recognition and UX. Recognition often failed silently. We added a “Listening…” state, surfaced onerror (permission, no-speech, network), and only used final transcripts so we don’t send half-phrases to the model.

Vercel build and env. The build ran vite build and failed with “vite: command not found.” We fixed it by using npx in the build command and clarifying root vs web in vercel.json (a sketch of that config is at the end of this write-up). We also had to set env vars in the Vercel dashboard (e.g. VITE_GEMINI_API_KEY) so detection works in production, since our local .env is, of course, never deployed.

We’re proud that ShelfTech now gives personalized, product-aware guidance - what to eat it with, when it fits your day, and how it fits your goals - instead of one-size-fits-all label text.
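For reference, the root vercel.json fix mentioned above amounts to something like this (a sketch; the exact command and output directory in our repo may differ):

```json
{
  "buildCommand": "npx vite build",
  "outputDirectory": "dist",
  "installCommand": "npm install"
}
```

If the Vite app lives in a web subfolder rather than the repo root, the same idea applies with the command and output paths pointing into that folder, which is the root-vs-web distinction mentioned above.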

Built With

React, Vite, TypeScript, Gemini, Dedalus, GrocerEye, Web Speech API, Vercel
