VoiceAid


Inspiration

Seeing elderly people struggle to use digital platforms because of visual impairments gave me an idea. I developed a web platform where users can upload documents or images. Gemini AI processes these uploads and communicates the content to the user through speech. I believe that if I keep building on this project, the tool can help many visually impaired people, or anyone who has trouble reading the fine print in documents.

What it does

This website takes in PDF or DOCX documents and extracts their text using JavaScript libraries. It then passes that text to the AI as the input prompt, together with the user's voice input. The user can ask questions about specific parts of the document just by talking. If an image is uploaded, the code automatically switches to the Gemini-Pro-Vision model to "look" at it, and the user can then ask questions about it.
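As a rough illustration of the flow described above, here is a minimal sketch of how the model could be chosen from the upload type and how the extracted text could be combined with the spoken question. The function names, the "gemini-pro" text model id, and the prompt layout are assumptions made for this example, not the project's actual code.

```javascript
// Sketch only: all names here are hypothetical, not taken from the VoiceAid source.
const VISION_MODEL = "gemini-pro-vision"; // model named in the write-up
const TEXT_MODEL = "gemini-pro";          // assumed id for the text-only model

// Pick a model from the MIME type of the uploaded file (e.g. file.type in the browser).
function pickModel(mimeType) {
  if (typeof mimeType === "string" && mimeType.startsWith("image/")) {
    return VISION_MODEL;
  }
  return TEXT_MODEL;
}

// Combine the text extracted from a PDF or DOCX with the user's transcribed question.
// The layout of the prompt is an assumption.
function buildPrompt(documentText, question) {
  const doc = (documentText || "").trim();
  const q = (question || "").trim();
  if (!doc) {
    return q;
  }
  return `Document:\n${doc}\n\nQuestion: ${q}`;
}
```

For example, `pickModel("image/png")` selects the vision model, while a PDF upload (`"application/pdf"`) falls through to the text model.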

How I built it

I built it with JavaScript, HTML, and CSS, and it is intended to be hosted online as a static website.

Challenges I ran into

Working with a brand-new technology meant running into multiple errors and bugs. One notable challenge was getting the program to switch models automatically depending on the type of input the user provides. Most of those errors are now fixed, and the site should work reliably.

Accomplishments that we're proud of

I learned how to use the Gemini API.

What I learned

I learned how the power of AI can be harnessed, including how to switch models and how to provide different types of input.

What's next for VoiceAid

With newer Gemini models, many more features could be added to improve the end user's quality of life, from larger file uploads to video recording uploads and more.

Built With

ai
api
css3
gemini-ai
html5
javascript

Updates

sankeerthikan nimalathas posted an update — Apr 06, 2024 10:40 AM EDT

Added a feature that displays the combined character count of the document and the user input on the site. Also added a button that leads users to get their API key if they don't already have one.
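The character count could be computed along these lines. This is a minimal sketch: the helper names and the label format are hypothetical, and it counts UTF-16 code units via `length`, which may differ from how the site actually counts.

```javascript
// Sketch only: hypothetical helpers, not the site's actual implementation.

// Combined length of the document text and the user's input.
// Missing values are treated as empty strings.
function countCharacters(documentText, userInput) {
  const doc = documentText ?? "";
  const input = userInput ?? "";
  return doc.length + input.length;
}

// Build the label that would be shown on the page (format is an assumption).
function formatCount(documentText, userInput) {
  return `${countCharacters(documentText, userInput)} characters`;
}
```

In the browser, the result of `formatCount` could be assigned to an element's `textContent` whenever the document or the input changes.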


sankeerthikan nimalathas started this project — Apr 05, 2024 10:14 PM EDT

Leave feedback in the comments!
