Inspiration

What it does

This project aims to bridge the translation gap between ASL users and speakers of other languages by translating each sign and saying the result aloud. The user's hand gestures are captured and analyzed to isolate the sign being made. After gathering information about the color scheme and contour differences, the image of the gesture is compared against a large database of labelled ASL gestures. Using a Convolutional Neural Network and other Machine Learning techniques, the gesture is classified to its closest counterpart in English. The user is then given the option to have the resulting sentence read out loud.
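At a high level, the flow looks roughly like the sketch below. `classify_gesture` is a hypothetical placeholder for the CNN classifier described under "How we built it", and pyttsx3 is just one text-to-speech library that could handle the read-aloud step; this is an outline of the idea, not our exact code.

```python
import cv2
import pyttsx3

def classify_gesture(hand_image):
    # Hypothetical placeholder for the CNN classifier described in
    # "How we built it"; returns the predicted letter/number or None.
    return None

cap = cv2.VideoCapture(0)      # open the default webcam
engine = pyttsx3.init()        # text-to-speech engine for the read-aloud step
sentence = []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    label = classify_gesture(frame)     # classify the current sign
    if label is not None:
        sentence.append(label)
    cv2.imshow("Sign To Sound", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press q to stop and speak
        break

cap.release()
cv2.destroyAllWindows()
engine.say(" ".join(sentence))             # read the collected signs aloud
engine.runAndWait()
```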

How we built it

The backend core of the project was built in Python using OpenCV. OpenCV was used to capture images of the user's hand gestures and, using Computer Vision, focus on the hand itself. The hand gestures were then classified by a Convolutional Neural Network built with Google's AutoML. The front end of the project was created using HTML, CSS and JavaScript along with Bootstrap. Finally, we used Flask to link the Python backend with the frontend.
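To illustrate the Computer Vision side, the sketch below shows one common way to isolate the hand in a frame: threshold skin tones in HSV space, keep the largest contour, and crop to its bounding box. The threshold values are illustrative and would need tuning per setup; this is a sketch of the approach rather than the exact code we used.

```python
import cv2
import numpy as np

def extract_hand(frame):
    """Isolate the hand region from a BGR webcam frame (illustrative values)."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Rough skin-tone range in HSV; real values need tuning for lighting/skin tone.
    lower = np.array([0, 30, 60], dtype=np.uint8)
    upper = np.array([20, 150, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    # OpenCV 4.x returns (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)   # assume the hand is the largest blob
    x, y, w, h = cv2.boundingRect(hand)
    return frame[y:y + h, x:x + w]              # cropped hand image for the classifier
```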

Challenges we ran into

During the course of the project we ran into a few challenges:

  1. Connecting the OpenCV backend to the frontend web app through Flask. Since the OpenCV portion of the project requires access to a web camera, a standard web deployment was not possible. This gave us an opportunity to learn some new skills around serving camera streams via OpenCV (see the streaming sketch after this list).

  2. Training accurate models - We struggled to find datasets large enough to train in the limited time frame and still give accurate classification results for the hand gestures. We had to compromise by limiting the "train" portion of our dataset, and so could not achieve the results we had hoped for.
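Regarding the first challenge, the usual pattern for getting OpenCV frames into a web page is to stream JPEG-encoded frames from Flask as a multipart response. Below is a minimal sketch of that pattern; the route name and details are illustrative rather than our exact code.

```python
import cv2
from flask import Flask, Response

app = Flask(__name__)
camera = cv2.VideoCapture(0)   # server-side webcam

def generate_frames():
    """Yield JPEG frames in multipart format so the browser can display them."""
    while True:
        ok, frame = camera.read()
        if not ok:
            break
        _, buffer = cv2.imencode('.jpg', frame)
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + buffer.tobytes() + b'\r\n')

@app.route('/video_feed')
def video_feed():
    return Response(generate_frames(),
                    mimetype='multipart/x-mixed-replace; boundary=frame')

if __name__ == '__main__':
    app.run(debug=True)
```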

Accomplishments that we're proud of

  1. Our classification worked relatively well for number gestures. We were able to classify numbers correctly almost 90% of the time.

  2. Our OpenCV code was able to carefully analyze the contours of the hand, giving us additional inputs to feed into our Machine Learning pipeline (one common form of such contour analysis is sketched below).
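As an illustration of the kind of contour analysis involved (not necessarily our exact implementation), a common trick for number gestures is to count extended fingers using convexity defects of the hand contour:

```python
import cv2
import numpy as np

def count_fingers(hand_contour):
    """Estimate the number of extended fingers via convexity defects (illustrative)."""
    hull = cv2.convexHull(hand_contour, returnPoints=False)
    defects = cv2.convexityDefects(hand_contour, hull)
    if defects is None:
        return 0

    fingers = 0
    for i in range(defects.shape[0]):
        s, e, f, depth = defects[i, 0]
        start, end, far = hand_contour[s][0], hand_contour[e][0], hand_contour[f][0]
        # Angle at the defect point: small angles correspond to gaps between fingers.
        a = np.linalg.norm(end - start)
        b = np.linalg.norm(far - start)
        c = np.linalg.norm(end - far)
        angle = np.arccos((b**2 + c**2 - a**2) / (2 * b * c + 1e-6))
        if angle < np.pi / 2 and depth > 10000:   # depth is in 1/256-pixel units
            fingers += 1
    return fingers + 1 if fingers > 0 else 0
```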

What we learned

What's next for Sign To Sound

As mentioned before, the accuracy of our models had to be compromised to fit the time restrictions of the hackathon. We intend to find larger and more robust datasets to help improve our accuracy. Additionally, we would like to allow ASL users to input entire sentences rather than single letters. This would help them communicate at a faster pace and would truly bridge a much-needed gap.

Built With

Python, OpenCV, Flask, HTML, CSS, JavaScript, Bootstrap, Google AutoML
