Inspiration
Our team of four engineers shares a strong interest in robotics, programming, and business analytics. When we learned that Purdue's BoilerMake XII was hosting a category for the use of Gen AI, our first idea was to use Gen AI to provide accessibility support to the differently abled. The inspiration came from an incident in which one of our teammates met a completely blind individual at the Lindner Center of Business. Seeing firsthand the challenges that person faced in everyday tasks, our teammate realized how grateful he was for his 20/20 vision. Imagine waking up every day in complete darkness, with no way to read a menu, recognize faces, or even tell whether you're holding a $1 or a $20 bill. This is the daily reality for over 300 million visually impaired people worldwide, and existing solutions are slow, expensive, and outdated. That’s why our team decided to engineer a product that helps visually impaired people experience the world around them more fully.
What it does
Our solution, VisionMate, uses Google Gemini 1.5 Flash to analyze images of the environment in front of the user and generate detailed descriptions. These descriptions are then converted into speech using Google Text-to-Speech and played to the user through bone-conduction speakers. VisionMate can also distinguish between currencies and denominations, translate text into the user’s preferred language, describe art in abstract terms, and much more!
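The flow described above (image in, description out, description spoken) can be sketched roughly as follows. This is a minimal illustration and not our actual firmware or server code: the prompt wording, the mode names, and the `describe` and `speak` callables are all hypothetical stand-ins. In a real build, `describe` would wrap a call to Gemini 1.5 Flash and `speak` would wrap Google Text-to-Speech.

```python
from typing import Callable

# Hypothetical prompt templates, one per feature mentioned above.
PROMPTS = {
    "scene": "Describe the scene in front of the camera in detail for a blind user.",
    "currency": "Identify the currency and denomination of any banknote in view.",
    "translate": "Read any visible text and translate it into {language}.",
    "art": "Describe this artwork, including its mood and style.",
}


def build_prompt(mode: str, language: str = "English") -> str:
    """Return the prompt for a mode, filling in the user's preferred language."""
    if mode not in PROMPTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return PROMPTS[mode].format(language=language)


def describe_and_speak(
    image: bytes,
    mode: str,
    describe: Callable[[str, bytes], str],  # stand-in for the Gen AI call
    speak: Callable[[str], None],           # stand-in for text-to-speech
    language: str = "English",
) -> str:
    """Run one capture through the pipeline and return the spoken text."""
    prompt = build_prompt(mode, language)
    description = describe(prompt, image).strip()
    if not description:
        # Always say something so the user is not left waiting in silence.
        description = "Sorry, I could not describe that."
    speak(description)
    return description
```

Passing the model and speech backends in as plain functions keeps the sketch runnable without API keys and makes it easy to swap either service.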
How we built it
Coming up with the design for the actual product was the tougher part of this project. As freshmen with no prior knowledge of the needs and wants of the visually impaired, we struggled at the very start because we could not agree on the product's features. After reaching out to a few institutions and doing some research, we were finally able to settle on a design that uses a camera mounted in the center of a pair of sunglasses, powered by a XIAO ESP32S3 Sense board.
Challenges we ran into
The first challenge we ran into was discovering that the microcontroller we had on hand could not control the I2S amplifier, which left us having to use the bulky ESP32 Dev Module. The second challenge was figuring out how bone conduction transducers work and finding a way to amplify the sound waves for a louder output. The third challenge, which had us scratching our heads, was fitting all of the last-minute electronics and circuits onto a pair of sunglasses.
Accomplishments that we're proud of
We used the styrofoam cups available at the hackathon venue to amplify the sound of the bone-conduction speakers without blocking the user's hearing. This was very important because our research indicated that losing one sense can heighten the others, and we did not want to take away this heightened sense of hearing by making our user wear earphones or headphones. Using styrofoam also helped us solve the last challenge mentioned above by letting us balance the weight of the components evenly on both sides. But the accomplishment we are most proud of is not that we came to understand the complex workings of generative AI, or even that we successfully integrated it into our product, but that we were able to contribute toward solving a problem that affects over 300 million people worldwide.
What we learned
Through this project, we gained extensive experience in using Gen AI APIs and integrating them into a hardware end product. We also learned the importance of user-centric design, attention to detail, and the magnitude of the impact that our solutions can have in the real world.
What's next for VisionMate
We want to develop this into a more polished product, with a compact design and affordable pricing, while remaining open to new directions. With no financial barriers, this product has great potential to make a major positive impact in the real world.
Built With
- google-gemini
- google-text-to-speech
