Shopper Buddy

Short Pitch

Every year, millions of people with vision loss lose something most of us take for granted: the ability to walk into a store and shop for themselves.

They depend on someone else to read labels. To count change. To decide what goes in the basket.

That's not independence. That's a workaround.

Today, we're introducing Shopper Buddy.

Point your phone at a shelf and tap the button. Our AI reads the product's name, brand, and price, and speaks them to you instantly. Tap again, and it's in your basket.

Shopper Buddy is a button-first, camera-powered shopping assistant for the visually impaired.

It uses AI to see what you can't, recognising products from a live camera feed using multimodal vision and embeddings. One tap triggers a scan. The result comes back as speech. No screen-reading required. No typing. No waiting.

You can also hold the button and speak — to add items, check your basket, or pay. But the button is always in control.

One button. Total independence.

"The most powerful thing we can build is something that gives someone their autonomy back."


Technical Details

Core Pipeline

  • User taps button → camera captures frame → image sent to AWS Bedrock
  • Claude 3 Haiku (vision) extracts: brand, name, quantity, packaging, colour, label text
  • Amazon Titan Embed Text v2 converts extracted text into 256-dim vectors
  • RAG (Retrieval-Augmented Generation): vectors queried against a pre-embedded product catalogue via cosine similarity search
  • Product catalogue built from CSV data covering the 5 largest supermarket chains in the Netherlands (last updated March 2026)
  • Matched product (name, brand, price) spoken aloud via OpenAI TTS
  • If confidence < 0.5, spoken as a probable match with disclaimer
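
A minimal sketch of this pipeline in TypeScript, assuming the AWS SDK v3 Bedrock Runtime client. The prompt wording, region, and the `catalogue` shape are illustrative stand-ins, not our production code:

```typescript
// Sketch of the scan pipeline (AWS SDK v3). The prompt wording, region,
// and catalogue shape are illustrative, not our production code.
import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

const bedrock = new BedrockRuntimeClient({ region: "eu-central-1" });

// Hypothetical shape of one pre-embedded catalogue entry (built offline
// from the supermarket CSV data).
interface CatalogueEntry {
  name: string;
  brand: string;
  price: number;
  vector: number[]; // 256-dim, L2-normalised
}
declare const catalogue: CatalogueEntry[];

// Step 1: Claude 3 Haiku describes the product in the camera frame.
async function extractAttributes(jpegBase64: string): Promise<string> {
  const res = await bedrock.send(new InvokeModelCommand({
    modelId: "anthropic.claude-3-haiku-20240307-v1:0",
    contentType: "application/json",
    body: JSON.stringify({
      anthropic_version: "bedrock-2023-05-31",
      max_tokens: 300,
      messages: [{
        role: "user",
        content: [
          { type: "image", source: { type: "base64", media_type: "image/jpeg", data: jpegBase64 } },
          { type: "text", text: "List this product's brand, name, quantity, packaging, colour, and label text." },
        ],
      }],
    }),
  }));
  return JSON.parse(new TextDecoder().decode(res.body)).content[0].text;
}

// Step 2: Titan Embed Text v2 turns that description into a 256-dim vector.
async function embed(text: string): Promise<number[]> {
  const res = await bedrock.send(new InvokeModelCommand({
    modelId: "amazon.titan-embed-text-v2:0",
    contentType: "application/json",
    body: JSON.stringify({ inputText: text, dimensions: 256, normalize: true }),
  }));
  return JSON.parse(new TextDecoder().decode(res.body)).embedding;
}

// Step 3: cosine similarity; with normalised vectors this is a dot product.
function bestMatch(query: number[]): { entry: CatalogueEntry; score: number } {
  let best = { entry: catalogue[0], score: -Infinity };
  for (const entry of catalogue) {
    const score = entry.vector.reduce((sum, v, i) => sum + v * query[i], 0);
    if (score > best.score) best = { entry, score };
  }
  return best;
}

// Step 4: the sentence handed to TTS, hedged when confidence is below 0.5.
async function scan(jpegBase64: string): Promise<string> {
  const { entry, score } = bestMatch(await embed(await extractAttributes(jpegBase64)));
  const label = `${entry.brand} ${entry.name}, €${entry.price.toFixed(2)}`;
  return score < 0.5 ? `This is probably ${label}.` : label;
}
```

Because Titan returns unit-length vectors with `normalize: true`, cosine similarity reduces to a dot product, which keeps the in-memory catalogue search fast enough to run inside a serverless function.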

Speech

  • Voice input: hold button → OpenAI Whisper (STT, batch-transcribed on release) → transcript
  • Voice output: OpenAI GPT-4o Realtime API → streaming PCM audio playback
  • Intent parsing via rule-based situation graph
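
A sketch of the voice path, assuming the official `openai` Node SDK; the rule table is a simplified stand-in for the full situation graph:

```typescript
// Sketch of the voice path, assuming the official `openai` Node SDK.
// The rule table is a simplified stand-in for the full situation graph.
import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI();

// Button release → batch transcription with Whisper.
async function transcribe(audioPath: string): Promise<string> {
  const result = await openai.audio.transcriptions.create({
    file: fs.createReadStream(audioPath),
    model: "whisper-1",
  });
  return result.text;
}

// Each situation permits only certain intents, so a misheard word cannot
// trigger an action that is out of context.
type Situation = "browsing" | "counting" | "checkout";
type Intent = "add" | "remove" | "read_basket" | "pay" | "confirm_count" | "unknown";

const rules: Record<Situation, Array<{ pattern: RegExp; intent: Intent }>> = {
  browsing: [
    { pattern: /\b(add|take|basket)\b/i, intent: "add" },
    { pattern: /\b(remove|put back)\b/i, intent: "remove" },
    { pattern: /\bread\b.*\bbasket\b/i, intent: "read_basket" },
  ],
  counting: [{ pattern: /\b(done|stop|that's all)\b/i, intent: "confirm_count" }],
  checkout: [{ pattern: /\b(pay|confirm)\b/i, intent: "pay" }],
};

function parseIntent(transcript: string, situation: Situation): Intent {
  for (const { pattern, intent } of rules[situation]) {
    if (pattern.test(transcript)) return intent;
  }
  return "unknown"; // spoken back as "I didn't catch that"
}
```

Scoping the rules by situation means a misrecognised word can never trigger an out-of-context action, which matters when the user cannot visually verify what just happened.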

Basket & Payment

  • Tap to count quantity (TTS counts each tap aloud); 2.5s silence auto-confirms
  • Voice commands: add, remove, read basket
  • Bunq Banking API (live balance check)
  • Warns via TTS if basket exceeds available balance
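
The tap-counting loop is a single timer that resets on every tap. A sketch, with hypothetical `speak` and `addToBasket` helpers; the Bunq balance check would hook in after the quantity is confirmed, and its session handshake is omitted here:

```typescript
// Sketch of tap-to-count with 2.5s auto-confirm. `speak` (TTS) and
// `addToBasket` are hypothetical helpers, not our actual API.
declare function speak(text: string): void;
declare function addToBasket(quantity: number): void;

const CONFIRM_AFTER_MS = 2500;
let taps = 0;
let confirmTimer: ReturnType<typeof setTimeout> | undefined;

function onTap(): void {
  taps += 1;
  speak(String(taps)); // count each tap aloud: "one", "two", ...
  // Restart the silence timer; 2.5s with no further taps confirms the quantity.
  if (confirmTimer !== undefined) clearTimeout(confirmTimer);
  confirmTimer = setTimeout(() => {
    addToBasket(taps);
    speak(`Added ${taps} to your basket.`);
    taps = 0; // this is also where the Bunq balance warning would fire
  }, CONFIRM_AFTER_MS);
}
```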

UI

  • Single large button occupies bottom 30% of screen — the entire interaction surface
  • Live camera feed top 70%; minimal overlay with basket count + total
  • Dark, high-contrast theme; designed to be used without looking at the screen
  • Mobile-first (max 480px), deployed on Vercel (serverless)
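
A layout sketch in React; the colours and the `BasketOverlay` component are illustrative, not our exact styling:

```tsx
// Layout sketch of the single-screen UI. Colours and BasketOverlay are
// illustrative stand-ins for our actual components.
import React from "react";

const BasketOverlay = ({ count, total }: { count: number; total: number }) => (
  <div style={{ position: "absolute", top: 8, left: 8, padding: "4px 8px",
                background: "rgba(0,0,0,0.7)", fontSize: "1.25rem" }}>
    {count} items · €{total.toFixed(2)}
  </div>
);

export function App() {
  return (
    <div style={{ height: "100vh", maxWidth: 480, margin: "0 auto",
                  display: "flex", flexDirection: "column",
                  background: "#000", color: "#fff" }}>
      {/* Top 70%: live camera feed plus the minimal basket overlay. */}
      <div style={{ flex: 7, position: "relative" }}>
        <video autoPlay playsInline muted
               style={{ width: "100%", height: "100%", objectFit: "cover" }} />
        <BasketOverlay count={0} total={0} />
      </div>
      {/* Bottom 30%: the entire interaction surface. Tap scans; hold speaks. */}
      <button aria-label="Scan product. Hold to speak."
              style={{ flex: 3, border: "none", fontSize: "2rem",
                       background: "#ffd500", color: "#000" }}>
        SCAN
      </button>
    </div>
  );
}
```

The 70/30 flex split keeps the button where a resting thumb lands, so the screen never needs to be looked at.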

How AI Is Used

AWS Bedrock powers the multimodal product recognition pipeline. We route every camera frame through Anthropic Claude 3 Haiku (via Amazon Bedrock) to extract product attributes — brand, name, quantity, packaging, colour, and label text — then embed them with Amazon Titan Embed Text v2 (also via Bedrock) into 256-dimensional vectors for cosine-similarity search against our pre-computed Dutch supermarket catalogue. The non-text modality is image: no barcode, no manual input, just a photo.

On the output side, audio replaces the screen entirely: OpenAI's Realtime API streams speech back to the user, while OpenAI Whisper adds a second audio modality for hands-free voice control. The result is an image-in, audio-out loop that requires no sight to operate.


What's Next

As a next step, we envision partnerships with supermarket chains to integrate directly with their live product databases: real pricing, real stock, real aisle locations. On the product side, we aim to add allergen and dietary alerts spoken automatically on scan, and multi-language support beyond English.

Longer term, we see Shopper Buddy expanding beyond grocery retail into pharmacies, clothing stores, and any environment where a label stands between someone and their independence.

The technology is ready. The partnerships are next.
