Inspiration
Ava attended ASL bingo, where the balls were called out in sign language rather than verbally. This project aims to improve inclusivity between people who use ASL and people who don't by giving them a new means of communicating with each other.
What it does
Our sign language interpreter recognizes the digits 0-9 and the letters A-Z from the device’s webcam. After the user presses “start”, our model interprets the hand signs shown in the webcam display, says them out loud, and displays the recognized input on the screen. We also made a speech-to-sign interpreter. It takes voice input, converts it to text, and then converts the alphabetic characters into the signed alphabet.
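The text-to-sign step can be sketched roughly as below. This is a minimal illustration in Python, not our actual web code: the `signs` folder, the one-image-per-letter file naming, and the `text_to_sign_images` helper are all assumptions made for the example.

```python
import string
from typing import List

# Hypothetical layout: one image per letter, e.g. "signs/a.png".
IMAGE_DIR = "signs"


def text_to_sign_images(text: str) -> List[str]:
    """Map each alphabetic character in `text` to a finger-spelling image path."""
    paths = []
    for ch in text.lower():
        if ch in string.ascii_lowercase:
            paths.append(f"{IMAGE_DIR}/{ch}.png")
        # Anything else (spaces, punctuation, digits) is skipped in this sketch.
    return paths


print(text_to_sign_images("Hi, Ava!"))
```

Given the transcribed text, the function returns the image paths in spelling order, so a page could show them one after another.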
How we built it
We used the Google AI API for video input. We took a series of images of ourselves performing each hand sign (around 80 per sign) and gave them to the Google model as training data. The result was an API model that we integrated into an HTML web page.
Challenges we ran into
Our first model attempt used a rate-limited API that could not reach sufficient accuracy within the limited number of training iterations. We then shifted to a Google API. When training that model, we started with around 10 pictures per sign, and the model could not reliably tell the different hand signs apart. The API model had a strong bias towards the number 3. In addition, different backgrounds, clothes, and people led to strong biases in other cases. We improved overall accuracy by adding more training data with more variety. We were also new to HTML and CSS, so formatting the webpage and getting it to our desired functionality was a bit of a challenge, though with guidance from organizers and online resources we were able to build a presentable, functional website to display our work.
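One way to make a bias like the one towards 3 visible is to compare how often each label is predicted with how often each sign is actually recognized correctly. The sketch below is a Python illustration under assumptions: the `summarize_predictions` helper is hypothetical, and the sample pairs are made up for the example rather than measured from our model.

```python
from collections import Counter
from typing import Dict, List, Tuple


def summarize_predictions(
    pairs: List[Tuple[str, str]],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (share of predictions per predicted label, accuracy per true label)."""
    predicted_counts = Counter(pred for _, pred in pairs)
    true_counts = Counter(true for true, _ in pairs)
    correct_counts = Counter(true for true, pred in pairs if true == pred)
    total = len(pairs)
    share = {label: n / total for label, n in predicted_counts.items()}
    accuracy = {label: correct_counts[label] / n for label, n in true_counts.items()}
    return share, accuracy


# Made-up (true sign, predicted sign) pairs, for illustration only.
sample = [("3", "3"), ("A", "3"), ("B", "3"), ("B", "B")]
share, accuracy = summarize_predictions(sample)
print(share)
print(accuracy)
```

A label whose share of predictions is far larger than its share of the test images, while other signs show low accuracy, suggests the model is collapsing onto that label and needs more varied training images for the others.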
Accomplishments that we're proud of
We are proud that we integrated the API model into the webpage and got the webpage to output sound. As mentioned above, getting the layout of the website where we wanted it was a challenge, so we are proud that we were able to get everything centered and output the signed letters to the screen!
What we learned
We learned that it takes a lot of data to train an AI model. We learned a lot about designing HTML pages and formatting them with CSS.
What's next for Sign Language Interpreter
In the future we would like to train our model with more images for each letter and number, and to expand its vocabulary. We would also like to find a way to train the model to ignore the background of the webcam input and focus only on the hands, face, and gestures. Sign language incorporates many gestures and facial expressions that our AI model does not yet capture, and we would like to add them. For speech to signs, we would like the Interpreter to take an audio file as input and output a series of images of finger-spelled signs.