Abstract
Deafness is a disability that impairs a person's hearing, while muteness is a disability that prevents a person from speaking. Despite these disabilities, deaf-mute people can still do many other things. If there were an easy way for hearing people and deaf-mute people to communicate, deaf-mute people could live much like everyone else. Currently, their primary means of communication is sign language.
Today, communicating with deaf-mute people means either hiring a sign language interpreter or having them carry a notepad everywhere they go. Interpreters are expensive, so an affordable solution is needed for deaf-mute and hearing people to communicate naturally.
A promising solution is a Sign Language Recognition system, which recognizes signs and translates them into the local language as text or speech. However, when we searched for datasets to train such a system, we could not find any containing actual words; we found only alphabet datasets for various sign languages.
To build a proper Sign Language Recognition system, we will collect data from around the world: videos of actual words in different sign languages, rather than just alphabets. This will let us create datasets for many languages. Researchers worldwide could later use these datasets to develop the best possible Sign Language Recognition systems, potentially even supporting cross-language interpretation.
What it does
The system collects video datasets of words in different sign languages from around the world, rather than just alphabets and digits. This will help us create video datasets for many languages, which researchers worldwide can use to build deep learning models that convert sign language into local languages. Since collecting this data will take a long time, the app also includes a few extra features that help people with disabilities communicate right away (a sketch of the text-to-speech call follows the list):
- Text-to-speech conversion using the Google Cloud Text-to-Speech API, for communication by disabled users.
- Speech-to-text conversion using Android's built-in speech recognition.
- Google Translate integration to overcome language barriers.
- WaveNet voices, which sound human rather than robotic.
- Emotion variants, letting users adjust the pitch and speaking rate of voices.
- Saving and reusing favorite phrases.
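Below is a minimal Kotlin sketch of how such a Cloud Text-to-Speech request might look with the google-cloud-texttospeech client. The voice name, language code, and parameter values are illustrative assumptions; in the shipped app the call would run behind proper credentials rather than directly on the device:

```kotlin
import com.google.cloud.texttospeech.v1.AudioConfig
import com.google.cloud.texttospeech.v1.AudioEncoding
import com.google.cloud.texttospeech.v1.SynthesisInput
import com.google.cloud.texttospeech.v1.TextToSpeechClient
import com.google.cloud.texttospeech.v1.VoiceSelectionParams

// Synthesize speech with a WaveNet voice; pitch and speakingRate serve as
// simple "emotion" knobs (e.g. higher pitch + faster rate for excitement).
fun synthesize(text: String, pitch: Double, speakingRate: Double): ByteArray =
    TextToSpeechClient.create().use { client ->
        val input = SynthesisInput.newBuilder().setText(text).build()
        val voice = VoiceSelectionParams.newBuilder()
            .setLanguageCode("en-US")
            .setName("en-US-Wavenet-D") // a WaveNet voice for a natural sound
            .build()
        val audio = AudioConfig.newBuilder()
            .setAudioEncoding(AudioEncoding.MP3)
            .setPitch(pitch)               // in semitones, roughly -20.0..20.0
            .setSpeakingRate(speakingRate) // 1.0 is normal speed
            .build()
        client.synthesizeSpeech(input, voice, audio).audioContent.toByteArray()
    }
```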
How we built it
The app is currently built for Android phones only; we will build a web app soon after launching on the Play Store. Firebase powers almost everything: videos are stored in Firebase Cloud Storage, while user authentication data and the favorite phrases used for text-to-speech conversion are kept in Firebase's Realtime Database. We also integrated the Cloud Text-to-Speech and Google Translate APIs to give users some extra features.
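As a rough sketch of how the pieces fit together, here is what uploading a sign video and saving a favorite phrase could look like with the Firebase Android SDK. The storage paths and database schema below are our illustration, not necessarily the production layout:

```kotlin
import android.net.Uri
import com.google.firebase.auth.FirebaseAuth
import com.google.firebase.database.FirebaseDatabase
import com.google.firebase.storage.FirebaseStorage

// Upload a recorded sign video, then register its metadata for later voting.
fun uploadSignVideo(localVideo: Uri, word: String, languageCode: String) {
    val uid = FirebaseAuth.getInstance().currentUser?.uid ?: return
    val path = "videos/$languageCode/$word/$uid-${System.currentTimeMillis()}.mp4"
    val videoRef = FirebaseStorage.getInstance().reference.child(path)

    videoRef.putFile(localVideo).addOnSuccessListener {
        val entry = mapOf(
            "uploader" to uid,
            "storagePath" to path,
            "upvotes" to 0,
            "downvotes" to 0
        )
        FirebaseDatabase.getInstance().reference
            .child("signs").child(languageCode).child(word)
            .push().setValue(entry)
    }
}

// Save a favorite phrase for quick text-to-speech reuse.
fun saveFavoritePhrase(phrase: String) {
    val uid = FirebaseAuth.getInstance().currentUser?.uid ?: return
    FirebaseDatabase.getInstance().reference
        .child("favorites").child(uid).push().setValue(phrase)
}
```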
Challenges we ran into
Initially, we were determined to build ML models ourselves, but we ran into the problem of insufficient data. That is when we decided to build a data collection app instead.
While building the dataset, there arises a problem of data legitimacy: is an uploaded video actually the correct sign for the given word? We tackle this with a voting system: users upvote correct videos and downvote (report) incorrect ones. In return for uploading videos and voting, users earn reward coins, which they can spend on the app's extra features (text-to-speech conversion, translation, saving favorites, and so on). A sketch of the voting flow appears below.
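One plausible way to implement this on the Realtime Database is with transactions, so that concurrent votes and coin credits stay consistent. The paths, field names, and coin amounts here are illustrative assumptions:

```kotlin
import com.google.firebase.database.DataSnapshot
import com.google.firebase.database.DatabaseError
import com.google.firebase.database.DatabaseReference
import com.google.firebase.database.FirebaseDatabase
import com.google.firebase.database.MutableData
import com.google.firebase.database.Transaction

// Atomically increment a counter (a vote tally or a coin balance) at `ref`.
fun increment(ref: DatabaseReference, onDone: (Boolean) -> Unit = {}) {
    ref.runTransaction(object : Transaction.Handler {
        override fun doTransaction(currentData: MutableData): Transaction.Result {
            currentData.value = ((currentData.value as? Long) ?: 0L) + 1
            return Transaction.success(currentData)
        }

        override fun onComplete(error: DatabaseError?, committed: Boolean,
                                currentData: DataSnapshot?) = onDone(committed)
    })
}

// Upvote or downvote a video, then credit the voter with a reward coin.
fun voteOnVideo(languageCode: String, word: String, videoKey: String,
                voterUid: String, isUpvote: Boolean) {
    val db = FirebaseDatabase.getInstance().reference
    val tally = if (isUpvote) "upvotes" else "downvotes"
    val voteRef = db.child("signs").child(languageCode).child(word)
        .child(videoKey).child(tally)

    increment(voteRef) { committed ->
        if (committed) {
            increment(db.child("users").child(voterUid).child("coins"))
        }
    }
}
```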
Providing the Text-to-Speech and Translate APIs costs real money, so we also added the option to watch ads: a user who is not keen on contributing videos but wants these features can still use them freely. We do not force ads to pop up every minute; there is a dedicated Buy/Earn Coins section where users go to watch ads, so we will not lose users over unnecessary advertising. A minimal sketch of such a rewarded ad follows.
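With the Google Mobile Ads SDK, the rewarded ad in that section might look like this. The ad unit shown is Google's public test ID, and the coin-crediting callback is left abstract:

```kotlin
import android.app.Activity
import com.google.android.gms.ads.AdRequest
import com.google.android.gms.ads.LoadAdError
import com.google.android.gms.ads.rewarded.RewardedAd
import com.google.android.gms.ads.rewarded.RewardedAdLoadCallback

// Load and show a rewarded ad from the Buy/Earn Coins screen.
fun showRewardedAd(activity: Activity, onCoinsEarned: (Int) -> Unit) {
    RewardedAd.load(
        activity,
        "ca-app-pub-3940256099942544/5224354917", // Google's test ad unit ID
        AdRequest.Builder().build(),
        object : RewardedAdLoadCallback() {
            override fun onAdLoaded(ad: RewardedAd) {
                ad.show(activity) { reward ->
                    // Credit coins only after the user finishes watching.
                    onCoinsEarned(reward.amount)
                }
            }

            override fun onAdFailedToLoad(error: LoadAdError) {
                // Nothing to show; the user can try again later.
            }
        }
    )
}
```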
Accomplishments that we're proud of
We are proud of taking the first step toward helping people who are deaf or mute. In our own way, we hope to change many lives. Even if the app does not drastically change how they live, it makes their lives a little better. One of our alpha testers, who is hard of hearing, uses the app to convert speech to text and told us it helped him when he once forgot his hearing aid at home.
What we learned
We learned about the software development life cycle, about working with Firebase to serve our users in real time, and about identifying and solving real-life problem statements.
What's next for BeMyVoice
We will continue collecting data for the time being. We will also build a web app using ReactJS and Firebase once the app officially launches on the Play Store.
We also plan to create a universal sign language learning portal, where the highest-rated video for each word becomes learning material. Furthermore, once the dataset is complete, we will publish it on Kaggle so that other keen ML developers can build proper sign language models freely, without doing so much groundwork themselves.
Further in the future, we plan to build an AR messaging model: the collected video data can be used to develop an augmented-reality 3D cartoon model that enacts sentences in sign language.