Access to education plays a key role in one's success in life. But even in a country as developed as the US, there are people slipping through the cracks. One such group is certain members of the Deaf community. According to Gallaudet University, deaf and hard of hearing children have a lower literacy rate than their hearing counterparts. Most hearing people learn to read using phonics, but how does that translate to Deaf education, where phonics does not apply? One solution is combining ASL fingerspelling with written letters and words to cultivate a connection between the visual and written languages. Deaf parents with Deaf children already do this instinctively, but what about Deaf children living in a hearing world, with no Deaf adults around them? All children deserve the ability to read, and technology is a way to help ensure that happens.
What it does
Compares a user-submitted image URL against the tagged photo sets we trained with the Microsoft Cognitive Services API, then returns the highest-scoring tag as its English-alphabet counterpart.
How we built it
We first took hundreds of pictures of each letter of the ASL alphabet. We then ran the photos through Microsoft's Cognitive Services API, creating a distinct tag for each of the 26 letters. To help the API recognize the hand shapes accurately, we varied the contrast, background, exposure, lighting, skin tone, and placement of the hand across the photos.
We used the WebRTC API to take photos of the letter sign via webcam, then saved the images by converting them to Blob objects. A callback function parses the JSON response from the API, giving us a tag ID and the probability that the image matches that tag. If the probability is greater than 50%, the image is classified as that tagged letter, and the translated character is output.
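The thresholding step above can be sketched as a small helper. This is a minimal sketch, not our exact code: the response shape (an array of `{ tagName, probability }` entries) is an assumption based on the Custom Vision prediction format, and `pickLetter` is a hypothetical name.

```javascript
// Hypothetical helper: choose a letter from the parsed prediction JSON.
// Assumes each prediction entry looks like { tagName, probability },
// which may differ from the actual response your endpoint returns.
function pickLetter(predictions, threshold = 0.5) {
  // Find the highest-probability tag among all predictions.
  let best = null;
  for (const p of predictions) {
    if (!best || p.probability > best.probability) best = p;
  }
  // Only accept the tag if it clears the confidence threshold;
  // otherwise report no match (null).
  return best && best.probability > threshold ? best.tagName : null;
}
```

In our flow, the returned letter (or a "no match" result) is what gets displayed as the translated character.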
We also used the Microsoft Computer Vision API and Node.js to translate a string of ASL images into an English word. We did this by stringing together the images' URLs, which can be obtained once the images are converted into Blob objects. This array is then run through the program so the images can be separated and analyzed with the API. Finally, each image is translated to its corresponding letter, and the letters are strung together to make a word.
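The word-assembly step can be sketched like this. The per-image classifier is injected as a parameter so the real Computer Vision call can be swapped in; `translateWord` and `classifyImage` are hypothetical names, not functions from our repo or the Microsoft SDK.

```javascript
// Sketch of assembling a word from an array of image URLs.
// `classifyImage` is an injected async function that resolves one
// image URL to a single letter (or null when no tag is confident
// enough) -- in our project this would wrap the Computer Vision call.
async function translateWord(imageUrls, classifyImage) {
  const letters = [];
  for (const url of imageUrls) {
    const letter = await classifyImage(url);
    if (letter) letters.push(letter); // skip unrecognized frames
  }
  // Join the per-image letters into the finished word.
  return letters.join('');
}
```

Injecting the classifier this way also makes the assembly logic easy to test with a mock before wiring up the real API.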
Challenges we ran into
We originally struggled to get the API to recognize the different hand shapes: it could recognize that a photo contained a hand, but not which letter it showed. After we met our push goal of having the API recognize the letters of the ASL alphabet accurately, we wanted to host a website that would let the user take photos with their webcam for translation, instead of having to upload image URLs manually. We also had trouble providing enough photos of each letter for the API to be accurate.
Accomplishments that we're proud of
Getting the API to work with the number of tags we needed was a bit of a challenge, but we still managed it. We also got a webcam working on the website, and even snapping a single picture. We are also stoked that we provided enough diversity of backgrounds and hand gestures for the API to recognize the different hand shapes.
What we learned
None of us had prior experience with Node.js, so we learned quite a bit about that. We learned how many photos a convolutional neural network needs for accurate tagging, and that they must be diverse. We also learned how to use GitHub for group projects and work on the same files as others, how much easier sharing files via the cloud is, and how to use the terminal to access files.
What's next for test
Allowing the user to take multiple photos on their webcam and have multiple letters translated through the website into full words.