Inspiration

My friend is visually impaired. This is his story:

"You know," he told me once, "nowadays we use AI everywhere."
"We use it to turn bullet points into nice emails, and then the recipient uses AI to summarize the email back into bullet points."
"We use AI to generate social media posts with stunning images. We now even use it to calculate 6×7."
"And yet, there is still no app I can use to help with my daily life."

What it does

Because the official Gemini app lacks accessibility features tailored to visually impaired users, I built a minimal, fully accessible app focused on grocery shopping: the user takes a photo, and an AI agent guides them through the shopping trip, identifying products and reading out their details.

How I built it

The mobile application is built with the Compose Multiplatform stack, so it runs on both Android and iOS. I integrated the Gemini 3 API using the gemini-3-flash-preview model, configured with a task-specific system prompt so it acts as a specialized assistant for visually impaired shoppers. The app also keeps short-term memory of the session history, so the agent retains context across turns.
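The short-term memory is essentially a sliding window over the chat turns that is replayed with every request. A minimal sketch in Kotlin (names like `ChatMemory` and `Turn` are illustrative, not the app's actual classes):

```kotlin
// Illustrative sketch of session memory: keep only the most recent turns
// so each Gemini request carries the relevant conversation context
// without growing unboundedly.
data class Turn(val role: String, val text: String)

class ChatMemory(private val maxTurns: Int = 10) {
    private val turns = mutableListOf<Turn>()

    fun add(role: String, text: String) {
        turns += Turn(role, text)
        // Drop the oldest turns once the window is full.
        while (turns.size > maxTurns) turns.removeAt(0)
    }

    // History replayed with the next request, oldest first.
    fun history(): List<Turn> = turns.toList()
}
```

On each user message, the app would append the turn, send `history()` as the contents of the next request, and append the model's reply afterwards.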

Challenges I ran into

  • Prompt Engineering: I had to iterate several times to optimize the system prompt for this specific task.
  • Context Management: Implementing short-term memory so the AI didn't lose the context of previous messages took significant time.
  • Documentation: As there was no clear documentation (or an API specification/Swagger file) for the REST API, I had to reverse-engineer parts of the integration.
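For reference, the request shape I eventually arrived at looks roughly like the sketch below. The JSON layout (system_instruction, contents, inline_data) follows the public generateContent endpoint; the helper itself is a simplified illustration (no JSON escaping), not the app's actual code.

```kotlin
// Sketch of the request body for
// POST https://generativelanguage.googleapis.com/v1beta/models/
//      gemini-3-flash-preview:generateContent
// The system prompt goes into "system_instruction"; the photo is sent as a
// base64-encoded "inline_data" part. Simplified: input is not JSON-escaped.
fun buildRequestBody(systemPrompt: String, userText: String, imageBase64: String): String = """
{
  "system_instruction": {"parts": [{"text": "$systemPrompt"}]},
  "contents": [{
    "role": "user",
    "parts": [
      {"text": "$userText"},
      {"inline_data": {"mime_type": "image/jpeg", "data": "$imageBase64"}}
    ]
  }]
}
""".trimIndent()
```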

Accomplishments that I'm proud of

  • The app genuinely makes the daily life of visually impaired people easier.
  • I reached out to the Swiss Federation of the Blind and Visually Impaired and they are interested in collaborating and potentially adopting this as their official app.

What I learned

  • How to build a robust AI agent using the Gemini 3 REST API.
  • How to effectively utilize system prompts and manage chat session context with memory.

What's next for Aid

The goal is to win this hackathon and invest the prize money into further development:

  • User Testing: Test the app with a wider group of visually impaired individuals and improve it based on their feedback.
  • Live API: Integrate the Gemini Live API into the initial picture-taking step, with voice-over support.
  • Expansion: Create additional agents optimized for other daily challenges faced by the visually impaired community. Move the solution to Google's AI platform and build a multi-agent solution.
