Inspiration

The inspiration behind Hear2See is to unlock a world of possibilities for the blind by providing them with a new way to understand and interact with their surroundings. We were driven by the desire to empower visually impaired individuals, promote inclusivity, and enable them to make informed decisions and engage more fully in their daily lives.

What it does

Hear2See is an innovative assistive technology project that leverages image recognition technology and audio synthesis to transform images into meaningful sounds. Users can capture images in real-time or upload existing photos, and the system provides instant audio feedback, allowing the visually impaired to navigate their environment more effectively and gain a deeper understanding of the world around them. It's a tool that serves as a bridge to independence and inclusivity, empowering the blind to enjoy richer experiences.

How we built it

Hear2See was built using a combination of image recognition technology, audio synthesis, and artificial intelligence. Here's how it works:

  • Image Capture: Users can capture real-time photos using their device's camera. The captured images are processed and described in real time.
  • Image Upload: Users can upload images from their device's personal collection. The uploaded images are processed and described, providing valuable feedback.
  • Enhanced Descriptions: The system uses multiple image recognition services, including Google Cloud Vision and Azure Computer Vision, to generate textual descriptions of the images. It then uses OpenAI's language model to consolidate and enhance these descriptions, producing concise, informative summaries.
  • Audio Feedback: The enhanced descriptions are converted into speech using Google Text-to-Speech (gTTS) and played back to the user, helping visually impaired individuals understand their surroundings.
  • Contact: Users can get in touch with the project team by sending emails directly from the web application.
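The pipeline above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual code: the Google Cloud Vision, Azure Computer Vision, and OpenAI calls are stubbed out with placeholder functions (real use requires API credentials), and only the orchestration — gather raw descriptions, consolidate them, then synthesize speech — is shown.

```python
# Hypothetical sketch of the Hear2See pipeline; service calls are stubs.

def describe_with_google_vision(image_bytes):
    # Placeholder: the real app would call the Google Cloud Vision API here.
    return ["a dog", "a park"]

def describe_with_azure_vision(image_bytes):
    # Placeholder: the real app would call Azure Computer Vision here.
    return ["a dog playing outdoors"]

def consolidate(per_service_descriptions):
    # Placeholder for the OpenAI call that merges raw labels into one summary.
    # Here we just deduplicate (preserving order) and join the labels.
    unique = list(dict.fromkeys(
        label for labels in per_service_descriptions for label in labels))
    return "The image shows " + ", ".join(unique) + "."

def image_to_speech_text(image_bytes):
    raw = [describe_with_google_vision(image_bytes),
           describe_with_azure_vision(image_bytes)]
    summary = consolidate(raw)
    # The real app would then convert the summary to audio with gTTS, e.g.:
    #   from gtts import gTTS
    #   gTTS(summary).save("description.mp3")
    return summary
```

Running descriptions from multiple services through a single consolidation step is what lets the final audio be one coherent sentence instead of a list of competing labels.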

Challenges we ran into

Developing Hear2See presented several challenges, including:

  • Image Recognition: Ensuring accurate image recognition to provide meaningful audio descriptions.
  • Real-Time Processing: Implementing real-time image processing and audio synthesis on mobile devices.
  • User Experience: Designing an intuitive user interface for blind users.
  • AI Training: Training the AI model to create specific descriptions.

Accomplishments that we're proud of

We are proud of several accomplishments with Hear2See:

  • Empowering the Blind: Providing visually impaired individuals with a unique tool for understanding their surroundings.
  • Real-Time Feedback: Achieving fast and accurate image-to-audio processing.
  • AI Adaptation: Developing a system that continually learns and adapts to individual user needs.
  • Promoting Inclusivity: Creating a project that promotes inclusivity and independence.

What we learned

Throughout the development of Hear2See, we learned valuable lessons about assistive technology and inclusivity, including:

  • The importance of user-centered design.
  • The power of technology to break down barriers.
  • The potential for AI to adapt to individual needs.
  • The positive impact of technology on people's lives.

What's next for Hear2See

The future of Hear2See holds exciting possibilities, including:

  • Building glasses with a built-in camera: at the press of a button, the glasses capture the wearer's surroundings and play an audio description in real time through built-in speakers.
  • Expanding the range of recognized objects and scenes.
  • Developing a more seamless and intuitive user experience.
  • Exploring partnerships to reach a wider audience.
  • Researching advancements in AI and image recognition technology to enhance accuracy and speed.
  • Empowering even more visually impaired individuals worldwide to experience their surroundings with greater confidence and understanding.

Built With

  • Google Cloud Vision
  • Azure Computer Vision
  • OpenAI
  • gTTS
