💡 Inspiration

We wanted to help people who have visual impairments or rely on screen readers, so we built a program that lets them both see and hear the alt text of an image.

📝 What it does

Altify lets users upload an image and receive auto-generated, narrated alt text for it. Our website uses object detection to identify the objects in the image, then reads aloud a sentence describing what the image might contain.

⚙️ How we built it

To implement image recognition, we used the pre-trained COCO-SSD model to predict the three most likely objects in the image. We then used Speech Synthesis to have our website read the alt text aloud.
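As a rough illustration of the step above, here is a minimal sketch of turning detections into spoken alt text. It assumes predictions shaped like the objects COCO-SSD's `detect()` returns (each with a `class` and a `score`). The helper names `topClasses`, `toAltText`, and `speak`, and the sentence template, are hypothetical and not taken from our actual code.

```javascript
// Keep the `limit` highest-scoring distinct class names.
// `predictions` is assumed to look like COCO-SSD output: [{ class, score, bbox }, ...].
function topClasses(predictions, limit = 3) {
  const sorted = [...predictions].sort((a, b) => b.score - a.score);
  const names = [];
  for (const p of sorted) {
    if (names.length >= limit) break;
    if (!names.includes(p.class)) names.push(p.class);
  }
  return names;
}

// Hypothetical sentence template; hedged wording because detections can be wrong.
function toAltText(names) {
  if (names.length === 0) return "No objects were recognized in this image.";
  if (names.length === 1) return `This image may contain: ${names[0]}.`;
  const head = names.slice(0, -1).join(", ");
  return `This image may contain: ${head} and ${names[names.length - 1]}.`;
}

// Read the text aloud when the browser exposes speech synthesis.
// Returns false (and does nothing) where it is unavailable, e.g. in Node.
function speak(text) {
  if (typeof speechSynthesis === "undefined") return false;
  speechSynthesis.speak(new SpeechSynthesisUtterance(text));
  return true;
}
```

In a browser, this could be wired up roughly as `const model = await cocoSsd.load(); speak(toAltText(topClasses(await model.detect(img))));`, assuming the `@tensorflow-models/coco-ssd` script is loaded and `img` is an image element that has finished loading.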

🧐 Challenges we ran into

The model we initially wanted to use (ResNet hourglass) did not work well in JavaScript, so we had to find a different one. We then tried the MobileNet model, but we found that it wasn’t very accurate and could only handle images containing a single object. Finally, we settled on COCO-SSD, a model trained on the COCO dataset, which allowed us to identify multiple objects in an image. Once we had selected our model, we struggled to make the image upload and the prediction happen in the right order: many of our early versions tried to generate text for an image that hadn’t finished loading yet.
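One way to avoid predicting on an image that has not loaded yet is to wrap the image's `load` event in a Promise and await it before calling the model. The sketch below is illustrative only. The names `whenLoaded` and `predictAfterLoad` are hypothetical, and `model` stands for any object with an async `detect(img)` method, such as the one `cocoSsd.load()` resolves to.

```javascript
// Resolve once the image has finished loading; reject if loading fails.
function whenLoaded(img) {
  return new Promise((resolve, reject) => {
    // An image that is already loaded will not fire "load" again.
    if (img.complete && img.naturalWidth > 0) {
      resolve(img);
      return;
    }
    img.addEventListener("load", () => resolve(img), { once: true });
    img.addEventListener(
      "error",
      () => reject(new Error("image failed to load")),
      { once: true }
    );
  });
}

// Hypothetical glue: run detection only after the image is ready.
async function predictAfterLoad(img, model) {
  await whenLoaded(img);
  return model.detect(img);
}
```

With this shape, the upload handler can set `img.src` and then `await predictAfterLoad(img, model)`, so the prediction never starts before the pixels are available.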

🥇 Accomplishments that we're proud of

We are proud that we got the image prediction feature working and built a polished user interface that adapts to multiple window sizes.

📚 What we learned

We learned how to use pre-trained models to detect objects in images and how to coordinate asynchronous steps in JavaScript. We also explored text-to-speech technology and gained a deeper appreciation for designing for diverse user needs and advancing accessibility on the web.

🔮 What's next for Altify

In the future, we would like to train our own model so that it is more accurate, recognizes smaller objects, and forms sentences about what might be in the image. We would also like to reduce the time it takes to make predictions.
