## Inspiration
I attended Dr. Yarrington’s Assistive Technology workshop at HenHacks, and it ended up shaping everything I built that weekend. The session introduced me to challenges I hadn’t previously considered, and afterward I spoke with Dr. Yarrington directly.
We talked about the everyday frustrations visually impaired people face — not dramatic emergencies, but small, constant obstacles:
- Dropping your keys
- Misplacing your wallet
- Not remembering where you set down your medication
For a sighted person, these are three-second problems. For someone who can’t see, they can become prolonged and stressful experiences. That conversation planted the seed for BlindSpot.
## The Problem: Convenience vs. Privacy
As I researched existing solutions, one concern kept resurfacing.
Apps like Be My Eyes connect blind users to a live human stranger who sees through their camera in real time. On the surface, this sounds helpful. But consider what that person might see:
- Your home environment
- Documents on your table
- Prescriptions
- Bank statements
- Personal belongings
The most widely used assistive tools often rely on giving strangers real-time visual access to private spaces. For users who may be elderly, living alone, or managing sensitive medical conditions, this introduces a serious privacy concern that is rarely discussed.
## The Solution: BlindSpot
BlindSpot replaces the human intermediary with AI.
Instead of calling a stranger, a user can simply say:
“Hey BlindSpot, where are my keys?”
The app uses the phone’s camera and Google Gemini’s vision AI to scan the environment and provide natural voice guidance:
“About 3 feet ahead at your 2 o’clock, next to the chair leg.”
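As a sketch of how structured output from the vision model might be turned into that spoken sentence (the field names `found`, `distance_ft`, `direction`, and `landmark` are illustrative assumptions, not the app's actual schema):

```python
# Illustrative sketch: turn hypothetical structured guidance from the
# vision model into the spoken sentence BlindSpot reads aloud.
# The field names here are assumptions, not the app's real schema.

def render_guidance(guidance: dict) -> str:
    """Format structured spatial guidance as natural voice output."""
    if not guidance.get("found"):
        return "I don't see that in view. Try panning the camera slowly."
    sentence = (
        f"About {guidance['distance_ft']} feet ahead "
        f"at your {guidance['direction']}"
    )
    if guidance.get("landmark"):
        sentence += f", next to {guidance['landmark']}"
    return sentence + "."

print(render_guidance({
    "found": True,
    "distance_ft": 3,
    "direction": "2 o'clock",
    "landmark": "the chair leg",
}))
# → About 3 feet ahead at your 2 o'clock, next to the chair leg.
```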
### Key Differences
- No stranger
- No phone call
- No waiting
- No external human viewing your space
Privacy is enforced at the AI level. Gemini is instructed to:
- Never read or repeat visible text
- Ignore documents, screens, ID cards, and prescriptions
- Alert the user if sensitive items are detected
The user remains fully in control of their environment.
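A minimal sketch of what prompt-level enforcement can look like in practice; the wording below is illustrative, not BlindSpot's actual system prompt:

```python
# Illustrative sketch of prompt-level privacy enforcement.
# This wording is an assumption, not BlindSpot's actual prompt text.

PRIVACY_RULES = [
    "Never read aloud, transcribe, or repeat any text visible in the frame.",
    "Ignore documents, screens, ID cards, mail, and prescription labels.",
    "If a sensitive item is visible, name only its category "
    "(for example, 'a document'), never its contents.",
    "Describe object locations using clock directions and approximate distance.",
]

def build_system_prompt(rules: list[str]) -> str:
    """Prepend the privacy rules to every vision request."""
    header = (
        "You are BlindSpot, a voice assistant helping a blind user "
        "locate objects.\n"
    )
    return header + "\n".join(f"- {rule}" for rule in rules)

print(build_system_prompt(PRIVACY_RULES))
```

Because the rules ride along with every request, there is no code path that reaches the model without them, which is what makes the enforcement structural rather than a toggle.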
## Why It Matters
BlindSpot was born from a single conversation at HenHacks. That’s what made the experience meaningful — exposure to someone who understands the problem deeply can redirect what you build.
If BlindSpot reaches even one person who needs it, then it has served its purpose.
## How We Built It
Python and FastAPI handle the backend — one file, two endpoints. The frontend is vanilla HTML, CSS, and JavaScript with no frameworks, keeping the build dead simple. Google Gemini 2.5 Flash processes each camera frame alongside the user's query and returns structured spatial guidance. ElevenLabs converts that guidance to natural speech, with the browser's Web Speech API as a zero-latency fallback. A continuous SpeechRecognition loop runs in the background listening for "Hey BlindSpot," and a cloudflared tunnel exposes the local server over HTTPS so iOS camera access works on a phone.
## Challenges We Ran Into
iOS Safari blocks audio playback unless it originates from a direct user tap — but a voice-first app for blind users can't always wait for a button press. Getting ElevenLabs audio to play after a wake word (with no tap) required building a persistent audio element unlock pattern triggered on any first touch anywhere on the screen. iOS speech recognition also mis-transcribes "Hey BlindSpot" in surprisingly many ways, so we built fuzzy matching across a dozen phonetic variants to catch them all.
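The wake-word matching itself runs in browser JavaScript; as a sketch of the approach (the variant list and similarity threshold here are made up for illustration), the logic is roughly:

```python
# Sketch of fuzzy wake-word matching. The real app does this in browser
# JavaScript; the variant list and 0.8 threshold here are illustrative.
from difflib import SequenceMatcher

WAKE_VARIANTS = [
    "hey blindspot", "hey blind spot", "hey blind spots",
    "a blind spot", "hey blindspots", "hey lindspot",
]

def is_wake_word(transcript: str, threshold: float = 0.8) -> bool:
    """True if the transcript is close enough to any known variant."""
    heard = transcript.lower().strip()
    return any(
        SequenceMatcher(None, heard, variant).ratio() >= threshold
        for variant in WAKE_VARIANTS
    )

print(is_wake_word("Hey BlindSpot"))     # exact match after lowercasing
print(is_wake_word("what time is it"))   # unrelated speech is ignored
```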
## Accomplishments That We're Proud Of
Privacy enforcement isn't a setting or a toggle — it's baked into every single AI call at the prompt level. Gemini is instructed to never read or describe any text in the frame, and actively flags sensitive items by category without revealing content. The wake word listener runs continuously from page load with no tap required, making the experience genuinely hands-free. The whole thing runs in a browser with no app install.
## What We Learned
The best ideas at a hackathon don't come from challenge prompts — they come from real conversations with people who live the problem. Dr. Yarrington's workshop gave us more direction in one hour than a week of browsing problem statements would have. We also learned that building for accessibility means building with accessibility — every button is ARIA-labeled, every response is announced to screen readers, and the UI works without ever looking at the screen.
## What's Next for BlindSpot
The phone is a proof of concept. The real target is Meta smart glasses — always on your face, always listening, always ready to help without pulling out a device. Next steps are reducing response latency, adding conversational memory so the AI can say "getting warmer" as the user moves, and exploring on-device wake word detection so the microphone never has to send audio to a server.
## Built With
- claude
- elevenlabs
- fastapi
- gemini
- javascript
- python