Delphi

📱 AI-Powered Vision Assistant

A mobile app that lets blind and low-vision users capture their surroundings and hear the information in the scene spoken aloud. It reads signs, flyers, labels, and other text, giving users real-time spoken insight into the visual world around them.

🔭 Current Scope

Capture a single image using the phone’s camera
Send the image to a backend server via MCP
Backend server forwards the image to the Gemini API
Extracted information is returned as JSON
App reads the text aloud via Text-to-Speech (TTS)
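
The last two steps above (JSON in, spoken text out) can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the JSON shape (an `items` list whose entries carry `text` and an optional `kind`) and the function name `build_speech_text` are assumptions made for the example, since the real schema is not described here.

```python
import json


def build_speech_text(response_json: str) -> str:
    """Turn the backend's JSON reply into a single string for the TTS engine.

    Assumed (hypothetical) reply shape:
        {"items": [{"kind": "sign", "text": "Exit"}, ...]}
    """
    data = json.loads(response_json)
    spoken = []
    for item in data.get("items", []):
        # Normalize the text and drop a trailing period so that joining
        # the pieces below does not produce "..".
        text = str(item.get("text", "")).strip().rstrip(".").strip()
        if not text:
            continue  # skip entries with nothing to read
        kind = str(item.get("kind", "")).strip()
        spoken.append(f"{kind}: {text}" if kind else text)
    if not spoken:
        # Always say something, so the user is not left with silence.
        return "No readable text was found in the image."
    return ". ".join(spoken) + "."
```

The returned string would then be handed to whatever TTS engine the phone provides. Returning a spoken fallback instead of an empty string matters for this audience, because silence is indistinguishable from a crash.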

💡 Inspiration

This idea came from a mix of things we’ve learned and observed.

In our digital design classes, we often talked about the importance of accessibility, designing not just for the majority, but for everyone. That got us thinking about how technology could help people with visual impairments experience the world more independently.

We imagined what it’s like to walk down the street and not be able to read signs, flyers, or menus. Something as simple as reading a sign shouldn’t be a barrier. We wanted to build a tool that could give that information back, through audio, in a way that’s fast, intuitive, and empowering.

Greek Oracle Theme ✨✨

The name Delphi connects to the ancient Greek Oracle of Delphi, known for providing guidance and insight. Similarly, the app acts as a “modern oracle” for blind and low-vision users, translating visual information into spoken guidance about their surroundings.

🧠 Challenges

Understanding User Needs:

None of us are blind or low-vision, so we had to empathize deeply to design appropriately
Research and accessibility guidelines helped, but real user feedback is critical

Back-End:

This was our first time working with MCP, so there was a learning curve in understanding server communication and image handling
Troubleshooting Gemini API integration and output formatting
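
As one illustration of the kind of output-formatting problem that can come up, language models sometimes wrap a JSON reply in a Markdown code fence or return plain prose instead of JSON. The sketch below is a defensive parser for that situation. It is an assumption about the class of issue, not a record of the specific bugs we hit, and `parse_model_json` is a hypothetical helper name.

```python
import json

FENCE = "`" * 3  # a Markdown code-fence marker


def parse_model_json(raw: str) -> dict:
    """Parse model output as a JSON object, tolerating a Markdown fence.

    On failure, returns a dict with an "error" key instead of raising,
    so the caller can fall back to a spoken error message.
    """
    cleaned = raw.strip()
    fenced = (
        len(cleaned) >= 2 * len(FENCE)
        and cleaned.startswith(FENCE)
        and cleaned.endswith(FENCE)
    )
    if fenced:
        body = cleaned[len(FENCE):-len(FENCE)]
        # The opening fence line may carry a language tag such as "json";
        # discard everything up to and including the first newline.
        first_newline = body.find("\n")
        cleaned = body[first_newline + 1:] if first_newline != -1 else body
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return {"error": "unparseable model output"}
    if not isinstance(parsed, dict):
        return {"error": "expected a JSON object"}
    return parsed
```

For example, a reply consisting of a fence tagged `json` around `{"items": []}` parses to the same dict as the bare JSON would, while a prose reply such as "Sorry, I cannot read this" yields the error dict rather than an exception.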

Front-End: