Inspiration
My teammate has an auditory disability, and it didn't stop her from graduating with a college degree and pursuing her career. She has achieved many great things, with many more to come, but that doesn't change the fact that her disability exists.
She joined this challenge to make the world a better place for everyone, regardless of their situation. As a software engineer, I am glad to have the opportunity to learn more about Azure services and to use my knowledge to try to do something good with what I know.
What it does
The application translates what is being said (in Spanish) into a sign language expression. Speech is first converted to text; the AI then translates that text into defined sign identifiers, which are used as keys for retrieving the corresponding objects from the database.
At this point it is limited to a couple of words and sentences, but a finished product would be capable of inferring countless sentences and words, provided, of course, they can be translated to sign. This would also include different models for different spoken languages and their corresponding sign languages.
How we built it
There are two key elements: speech recognition and text translation.
The first is performed by Azure Speech to Text, which recognizes speech and transforms it into text. The second is handled by a PyTorch model deployed in a Machine Learning container. The AI model is an encoder-decoder neural network that translates Spanish into the basic form of a sign language expression.
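A minimal sketch of such an encoder-decoder, assuming a GRU-based seq2seq with greedy decoding; the vocabulary sizes, dimensions, and token ids here are illustrative, not the deployed model's actual architecture:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Encodes a Spanish sentence (token ids) into a hidden state."""
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):
        _, hidden = self.rnn(self.embed(src))
        return hidden  # shape: (1, batch, hid_dim)

class Decoder(nn.Module):
    """Predicts the next sign-gloss id given the previous one and the state."""
    def __init__(self, vocab_size, emb_dim=64, hid_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tok, hidden):
        output, hidden = self.rnn(self.embed(tok), hidden)
        return self.out(output), hidden

def greedy_translate(encoder, decoder, src, sos=1, eos=2, max_len=10):
    """Greedily decode a sequence of sign identifiers from a source sentence."""
    hidden = encoder(src)
    tok = torch.tensor([[sos]])
    glosses = []
    for _ in range(max_len):
        logits, hidden = decoder(tok, hidden)
        tok = logits.argmax(-1)          # pick the most likely next gloss
        if tok.item() == eos:
            break
        glosses.append(tok.item())
    return glosses

src = torch.tensor([[5, 9, 3]])          # token ids of a Spanish sentence
enc, dec = Encoder(vocab_size=50), Decoder(vocab_size=40)
glosses = greedy_translate(enc, dec, src)  # untrained, so output is arbitrary
```

The decoded gloss ids would then be mapped back to sign identifiers for video lookup.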
The video URLs are stored in an Azure Cosmos DB, keyed by the word itself (or an identifier for a "compound word" sign). This is done because the videos are open to the public but hosted outside the Azure services space.
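The lookup itself reduces to a key-value retrieval. A hypothetical sketch, with a plain dict standing in for the Cosmos DB container (the identifiers and URLs below are made up):

```python
# Stand-in for the Cosmos DB container: sign identifier -> public video URL.
# Both the identifiers and the URLs are hypothetical examples.
SIGN_VIDEOS = {
    "hola": "https://example.com/signs/hola.mp4",
    "buenos_dias": "https://example.com/signs/buenos_dias.mp4",  # compound-word sign
}

def videos_for(glosses):
    """Map the model's output identifiers to video URLs, skipping unknown ones."""
    return [SIGN_VIDEOS[g] for g in glosses if g in SIGN_VIDEOS]

urls = videos_for(["hola", "gracias"])  # the unknown "gracias" is skipped
```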
The serverless functionality is provided by Azure Function Apps as two Python functions: one retrieves the key needed to use the Azure Speech Services, and the other both hides the Machine Learning container endpoint and retrieves objects from Azure Cosmos DB through an input binding.
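For reference, a Cosmos DB input binding for a Python function is declared in its function.json; the database, container, and parameter names below are assumptions for illustration, not the project's real configuration:

```json
{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": ["get"]
    },
    {
      "type": "cosmosDB",
      "direction": "in",
      "name": "signs",
      "databaseName": "SignsDB",
      "collectionName": "Videos",
      "connectionStringSetting": "CosmosDBConnection",
      "sqlQuery": "SELECT * FROM c WHERE c.id = {word}"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    }
  ]
}
```

With this binding, the function receives the matching documents as a parameter and never exposes the database connection to the client.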
Finally, the application is hosted as a Web App with Azure App Service.
Challenges we ran into
The main challenge was the AI itself, which took multiple attempts to get right.
The first attempt was a simple text classification neural network to identify and classify text. This didn't work: by around 20 different words, training started to fail. This approach also required recognizing sign language compound words, which was solved with a second AI model, but once the text classification model failed the whole idea became unsustainable.
Then came the encoder-decoder neural network, which solved both problems but requires specific sentences to make correct inferences. As such, this solution needs a large enough database to cover all possible use cases, but it works nevertheless. This is the current deployment.
Accomplishments that we're proud of
This is a test project, but we are proud that something like this application can be done. With this, it is easy to picture how people with hearing disabilities can be more involved in everyday activities that require speech or hearing. A "final" version of this application could be used in academics to teach more inclusive classes, to participate more fully in company or business meetings, or to have a more engaging casual conversation.
(Developer) I'm glad that our skills and knowledge can actually be useful to make a positive impact in the world, code by code. I'm glad I found the motivation to explore Azure services and discover how truly useful they can be and how many amazing things they can accomplish. I'm excited to see what future we can create with AI technology.
(Deaf/Hard of Hearing User) As an Industrial & UX/UI Designer, I am very proud to bring this idea and work together with my teammate Michael. It's proof that there is still a lot of work to be done on the neglected problems of hard of hearing or deaf people, and of people with disabilities in general. I believe that design and technology can empower people with unique needs for a better quality of life. Accessibility should be created for all, not just a handful.
What we learned
We have learned that truly the sky is the limit: everything can be achieved if we put our minds to it. If we have dreams, we must pursue them and never relent to the hardships of life. We have to live not only for ourselves, but pursue the happiness of others and make great things for the world.
Technology is supposed to facilitate complex activities, but also to make life richer and easier than it was before. AI is the next big technological breakthrough, if it hasn't already arrived. With just a tad of creativity, the uses of AI are limitless, and it can make people's lives better.
What's next for Talk in Signs
This topic came up a few times. We think that with more knowledge we could make this application really useful. If we incorporated custom animations, we could adjust the translation speed to a more natural pace. This is just proof that it can be done, but it requires more refinement.
This is a project that could change the lives of millions of people and break down the barrier of hearing disability. It could be used as a way to learn sign language for those looking to learn more, or combined with existing open source AI models for sign language to text (and then to speech) for near-natural back and forth communication.

