There are thousands of deaf-mute Singaporeans and millions more across the globe. We wanted them to be able to express themselves and have a voice while conversing in their own language: American Sign Language (ASL). If a deaf-mute person wishes to express their opinion at a public event but no translator is available, we want to give them the chance to express it anyway.
What it does
Our hack trains a model on a dataset of images corresponding to each ASL sign, captures the user's gestures through a webcam, and tests them against the training data to predict the sentence the deaf-mute user just expressed. The predicted sentence is then displayed for a second user or a larger audience to view.
How I built it
We used TensorFlow with Keras for the image processing that predicts the category of each of the user's gestures. We used OpenCV and Python to access the webcam, and HTML and Flask to combine these components and display the final predicted sentence to the users.
Challenges I ran into
We had to choose a training model that gave us the highest possible accuracy on the test data and could also classify the majority of signs into their correct categories. Since few datasets cover ASL with images of suitable quality (background, resolution), we had to go with the best of those available. We found it extremely difficult to get the same quality of prediction on pictures taken by the webcam, because (1) resizing the images lost a lot of the valuable data, and (2) background noise often caused an image to be classified into the wrong category. Browsers also require a secure host to grant webcam access, so the local host served by our HTML page could not reach the webcam, and we had to work around this to get our idea running. Finally, since we were unfamiliar with Flask, it took us a while to combine the three components together.
What I learned
We learnt about image processing and making predictions with machine learning on dynamically obtained image files. It also gave us a chance to explore OpenCV, HTML and Flask.
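A minimal sketch of how Flask can glue the pieces together and serve the latest prediction to the page. The route name and the shared `latest_sentence` variable are assumptions for illustration:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical shared state, updated by the recognition loop as signs are predicted.
latest_sentence = ""


@app.route("/sentence")
def sentence():
    """Serve the most recently predicted sentence as JSON for the HTML page to poll."""
    return jsonify({"sentence": latest_sentence})


if __name__ == "__main__":
    # ssl_context="adhoc" serves over HTTPS with a self-signed certificate
    # (requires the `cryptography` package), so the browser will allow
    # webcam access even on a local host.
    app.run(ssl_context="adhoc")
```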
What's next for ASLAssist
We want to train on a much larger dataset (one that can recognize the user's gestures against any random background) so that we can predict the user's gestures with higher accuracy. We also want to build a two-way tool that lets two people converse: the user expresses themselves in ASL, which is converted to text for the other person in the conversation, and their reply is converted back to ASL for the deaf-mute user to understand.