Inspiration

Over 2.2 billion people live with vision impairment. Simple tasks — reading a label, picking an outfit, taking a photo — require asking someone for help. We wanted to build a companion that sees the world with you and talks like a friend. When we saw Gemini Live API's real-time audio+video streaming, we knew we could make it happen.

What it does

Visionary is a voice-first AI assistant for visually impaired users. No buttons, no screens — everything by speech.

  • See — "What's in front of me?" Live scene description from the camera feed.
  • Read — Point at a menu or sign, it reads aloud.
  • Photo Director — AI guides framing by voice: "Move left... tilt up... perfect!" Then auto-captures.
  • Create — Generate stylized images from voice prompts.
  • Share — Post to Bluesky or hear your feed read aloud, entirely hands-free.

How we built it

We built Visionary with Google AI Studio App Builder, Gemini CLI, and Claude Code.

Challenges we ran into

  • Audio timing — Browser AudioContext suspends unless created during a user gesture. Initialization order was critical.
  • The hallucination gap — AI described things before camera frames arrived. Fixed with system instructions enforcing honesty: "I'm waiting for the camera to activate."
  • Bluesky uploads failing — Camera photos exceeded the 1MB limit. Built progressive JPEG quality reduction to fit.
  • Silent deployment failure — First Cloud Run deploy had an empty API key (.env in .dockerignore, build arg not passed). App connected then immediately disconnected.
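
The progressive JPEG quality reduction from the Bluesky fix can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the project's actual code: the names `fitJpegUnderLimit` and `JpegEncoder`, the starting quality, the floor, and the step size are all hypothetical, and the 1,000,000-byte constant is our reading of the "1MB limit" above. The encoder is passed in as a function so the loop stays independent of how the image is actually encoded.

```typescript
// Hypothetical encoder type: returns JPEG bytes for a quality in (0, 1].
// In a browser this could wrap a canvas export; here it is injected.
type JpegEncoder = (quality: number) => Uint8Array;

interface FitResult {
  bytes: Uint8Array;
  quality: number;
}

// Assumed value for the "1MB limit" mentioned above.
const ASSUMED_MAX_BYTES = 1_000_000;

// Re-encode at decreasing quality until the result fits under maxBytes.
// Returns null if even the lowest allowed quality is still too large.
function fitJpegUnderLimit(
  encode: JpegEncoder,
  maxBytes: number = ASSUMED_MAX_BYTES,
  startQuality: number = 0.9,
  minQuality: number = 0.3,
  step: number = 0.1,
): FitResult | null {
  if (step <= 0) {
    throw new Error("step must be positive");
  }
  for (let i = 0; ; i++) {
    // Round to two decimals so floating-point drift does not skip the floor.
    const quality = Math.round((startQuality - i * step) * 100) / 100;
    if (quality < minQuality) {
      return null;
    }
    const bytes = encode(quality);
    if (bytes.byteLength <= maxBytes) {
      return { bytes, quality };
    }
  }
}
```

If the function returns null, a caller would need another fallback, such as downscaling the image dimensions before retrying. That step is outside this sketch.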

Accomplishments that we're proud of

We found the hackathon three days before the deadline and still shipped something that works. :)

Watching someone who can't see get voice-guided to frame a perfect photo — "move left, tilt up, perfect!" — then post it to social media without touching a button. That moment proved this matters. Also: one codebase for iOS + web, and full infrastructure provisioned with a single terraform apply.

What we learned

  • Gemini Live's native audio is a step change from TTS — it feels like a real conversation.

What's next for Visionary

Android support, sign language translation, more social platforms, and an App Store release.
