Inspiration
Our journey began with a visit to a school for children with disabilities and heartfelt conversations with them about their daily lives. The candid, immediate exchanges with these inspiring young people sparked a deep desire in us to make a meaningful difference in their lives. The experience was more than eye-opening; it was transformative, forcing us to think creatively about how technology could bridge gaps in their everyday experiences.
What it does
VisualSynth opens a new world for those who see it differently. It is designed for visually impaired people and anyone who struggles with recognizing their surroundings. Here's how it works:
- Open the app: a user-friendly interface greets every user.
- Activate the camera: with a simple tap, the camera comes to life, ready to capture the world in front of the user.
- Image capture to audio transformation: using the Gemini API, the app quickly converts captured images into text and then into clear, understandable speech.
This rapid conversion allows users to "hear" their environment, providing an audio description of things like signs, menus, and even facial expressions.
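The capture-to-audio flow above can be sketched in Dart. This is a minimal illustration, assuming the `google_generative_ai` and `flutter_tts` packages; the function name `describeAndSpeak`, the model id, and the prompt wording are our own illustrative choices, not necessarily the app's exact code.

```dart
// Sketch: one camera frame -> Gemini description -> spoken audio.
import 'dart:typed_data';

import 'package:google_generative_ai/google_generative_ai.dart';
import 'package:flutter_tts/flutter_tts.dart';

// A vision-capable Gemini model; the API key is read from the environment.
final _model = GenerativeModel(
  model: 'gemini-1.5-flash',
  apiKey: const String.fromEnvironment('GEMINI_API_KEY'),
);
final _tts = FlutterTts();

/// Sends one captured JPEG frame to Gemini and reads the answer aloud.
Future<void> describeAndSpeak(Uint8List jpegBytes) async {
  final response = await _model.generateContent([
    Content.multi([
      TextPart('Briefly describe this scene for a visually impaired user.'),
      DataPart('image/jpeg', jpegBytes),
    ]),
  ]);
  final description = response.text ?? 'Sorry, I could not read the scene.';
  await _tts.speak(description); // text-to-speech output
}
```

In practice the frame bytes would come from a `camera` plugin controller wired to the "activate the camera" tap.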
How we built it
We chose Flutter as our development framework due to its versatility and support for both Android and iOS platforms from a single codebase. This decision enabled us to focus more on enhancing the app's features rather than dealing with platform-specific issues.
At the heart of VisualSynth is the Gemini API, which converts captured visual data into text and then into speech, making real-time assistance possible.
Challenges we ran into
Transforming VisualSynth into a multilingual assistant was our steepest challenge. We wanted to ensure that no one was left behind due to language barriers.
In addition, minimizing processing time to deliver near-instantaneous feedback was essential, as any delay disrupts the user experience.
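One way the multilingual side could be handled is by switching the speech engine to the user's locale before speaking. The sketch below assumes the `flutter_tts` package; the locale table and the `speakIn` helper are illustrative assumptions, not the app's exact implementation.

```dart
// Sketch: pick a matching text-to-speech voice for the user's language.
import 'package:flutter_tts/flutter_tts.dart';

final _tts = FlutterTts();

/// Maps an app language code to a BCP-47 locale the TTS engine understands.
/// (Example entries only; a real table would cover every supported language.)
const speechLocales = {
  'en': 'en-US',
  'hi': 'hi-IN',
  'es': 'es-ES',
};

/// Speaks `text` in the requested language, falling back to English.
Future<void> speakIn(String languageCode, String text) async {
  await _tts.setLanguage(speechLocales[languageCode] ?? 'en-US');
  await _tts.speak(text);
}
```

Prompting Gemini to respond in the same language keeps the description and the voice consistent.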
Accomplishments that we're proud of
We are immensely proud that VisualSynth became a tool that promotes inclusion and accessibility. Making it multilingual was not just a technical challenge but a commitment to inclusivity, ensuring that users from different linguistic backgrounds can benefit from our app.
What we learned
Our initial research involved diving into the day-to-day experiences of visually impaired individuals, which revealed the importance of sound in their spatial awareness and safety. This insight was invaluable.
Working with Google AI Studio and Gemini API not only enhanced our technical skills but also taught us about the potential and impact of AI in real-life applications.
What's next for VisualSynth
The future of VisualSynth is filled with exciting possibilities. We plan to expand its capabilities by integrating video processing for a more dynamic feedback system and exploring the potential of AR/VR and IoT to create more immersive experiences.
These advancements will not only enhance the practical utility of VisualSynth but also deepen its impact on users' lives.
Built With
- ai
- api
- app
- dart
- flutter
- gemini
- getx
- google-cloud
- googleaistudio
- social-actions