Inspiration
We wanted to build something that reflects Technica's mission of expanding inclusion through technology. Our goal was to design a tool that helps people with limited or no vision better interact with their environment. By using the camera to interpret the world and converting those visuals into spoken descriptions, we created a way to make everyday surroundings more accessible.
What it does
Seeing Sound is an image-to-voice description tool. It takes in an image, analyzes what it contains, and verbally communicates the key details to the user, providing a clearer understanding of their environment.
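For anyone curious how a pipeline like this fits together, here is a minimal sketch in Python. It is not our exact implementation: it swaps in a local image-captioning model (Salesforce/blip-image-captioning-base via Hugging Face transformers) and the pyttsx3 text-to-speech library as stand-ins for the AI service APIs we integrated, and the file name photo.jpg is just a placeholder for the captured image.

```python
# Minimal sketch of an image-to-voice pipeline (not our actual code).
# Assumptions: a local captioning model stands in for the hosted AI APIs,
# and pyttsx3 handles offline text-to-speech on the device.
from transformers import pipeline
import pyttsx3


def describe_image(image_path: str) -> str:
    """Generate a short text description of the image."""
    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
    result = captioner(image_path)  # e.g. [{"generated_text": "a dog sitting on a couch"}]
    return result[0]["generated_text"]


def speak(text: str) -> None:
    """Read the description aloud through the system's speech engine."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


if __name__ == "__main__":
    description = describe_image("photo.jpg")  # placeholder path to the captured image
    speak(description)
```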
How we built it
Our team divided responsibilities based on individual strengths. Ashley and I developed the backend, Naima worked on the hardware integration, and Ashi built the frontend interface. We collaborated through GitHub to merge our work and bring everything together into a fully functional application.
Challenges we ran into
Working remotely posed communication challenges, especially since I was located in New York. However, with consistent coordination through GroupMe and Google Teams, we were able to stay connected, problem-solve together, and keep the project moving!
Accomplishments that we're proud of
Hearing the application speak its first image description was an incredible moment. Watching it successfully interpret a photo and express it aloud truly made the project feel impactful and real.
What we learned
We gained experience integrating multiple APIs from an AI service into our own codebase, turning them into a cohesive, functioning product. It strengthened our understanding of backend development, hardware interaction, and cross-team collaboration.
What's next for Seeing Sound
In the future, we hope to incorporate real-time camera use and motion-activated detection so the tool can describe surroundings continuously as they change.
Built With
- java
- javascript
- logitech
- python