We noticed an intangible gap between people who cannot speak and the rest of the population. Even at our university, we do not see disabled students studying alongside everyone else. When we looked into it, we found that India has only 250 certified sign language interpreters, translating for a deaf population of between 1.8 million and 7 million. We built a simple but useful application to help fill this gap.

What it does

VoiceBox gives a voice to the mute by helping them hold a one-to-one conversation with people who do not know sign language. It is not just an app for the mute but also for people like us who wish to bridge the gap and take a step towards inclusion. Any user can point the camera at a person who is signing (with consent, of course) and understand what they are trying to convey.

How we built it

We used a public dataset of American Sign Language gestures and trained a convolutional neural network using PyTorch. We then used OpenCV to extract the gestures from the webcam feed and pass them to the model, which identifies each gesture. The gesture is then mapped to the letter it corresponds to. We are also adding an autosuggestion feature, like the one on a mobile phone keyboard, so that the person does not need to make every hand gesture and can convey a whole sentence with only a few.
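The last two steps, mapping the model's output to a letter and suggesting words from the letters signed so far, can be sketched in plain Python. This is only an illustration under assumptions, not our actual implementation: it assumes the model outputs one score per letter in A to Z order, and the names `predict_letter`, `suggest`, and the tiny `VOCABULARY` list are hypothetical.

```python
import string

# Assumption: the classifier returns one score per letter, ordered A..Z.
LETTERS = string.ascii_uppercase

# Hypothetical vocabulary; a real app would load a larger word list.
VOCABULARY = ["HELLO", "HELP", "HOME", "THANKS", "THIRSTY", "WATER"]


def predict_letter(scores):
    """Return the letter whose class score is the highest."""
    if len(scores) != len(LETTERS):
        raise ValueError("expected one score per letter")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return LETTERS[best_index]


def suggest(prefix, vocabulary=VOCABULARY, limit=3):
    """Return up to `limit` vocabulary words starting with the signed prefix."""
    if not prefix:
        return []
    prefix = prefix.upper()
    matches = [word for word in vocabulary if word.startswith(prefix)]
    return matches[:limit]


def one_hot_scores(letter):
    """Build a fake score vector that strongly favours one letter (demo only)."""
    scores = [0.0] * len(LETTERS)
    scores[LETTERS.index(letter)] = 0.9
    return scores


if __name__ == "__main__":
    # Pretend two webcam frames were classified as "H" and then "E".
    signed = "".join(predict_letter(one_hot_scores(c)) for c in "HE")
    print(signed, suggest(signed))
```

With a prefix lookup like this, the signer can stop after a couple of letters and pick one of the suggested words instead of spelling the whole word.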

Challenges we ran into

We were stuck for a long time on getting useful inferences out of the model. Even though it reported high accuracy, it did not give satisfactory results when we supplied it with our own input.

Accomplishments that we're proud of

We are very happy that we successfully integrated the model and the webcam feed into a web application built on Flask, so that it is easy to access and use.

What we learned

Building this project was a steep learning curve, with a number of achievements and a few disappointments. We thoroughly enjoyed creating it and broadened our PyTorch knowledge along the way.

What's next for VoiceBox

Right now the inference speed is a little low, and we know that people who are fluent in ASL can make gestures very quickly. We aim to match that speed in our inference so that we can provide a seamless way for disabled people to communicate.
