Inspiration
The need for accessibility in digital communication has never been more critical. With remote work and virtual meetings becoming the norm, individuals who are deaf or hard of hearing face challenges participating in meetings, especially when they rely on American Sign Language (ASL). Interpreters are often needed to bridge the gap, but finding one can be difficult, expensive, and time-consuming. The idea behind SignSpeak is to help ASL users communicate effectively in Zoom meetings by automatically translating their signs into English and announcing them aloud, making virtual communication more inclusive for the deaf community.
What it does
SignSpeak is designed to recognize ASL signs and translate them into English, using voice output to announce what has been signed. The current version supports the letters of the ASL fingerspelling alphabet, converting them into spoken English; as the project progresses, more words and full sentences will be recognized. The goal is to eventually bring the technology into Zoom meetings to offer seamless, real-time ASL-to-English translation for participants, breaking down barriers in virtual communication.
How we built it
SignSpeak relies on a machine learning model trained on large datasets of ASL images. The model recognizes signs by analyzing frames captured from a webcam and mapping them to their corresponding English letters. For now, the system handles only the fingerspelling alphabet, but it can support more advanced translation as the dataset expands. I use TensorFlow for the model's training and inference, and integrate it with a Zoom widget that will eventually allow real-time translation within meetings.
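The recognition-to-speech step described above can be sketched as follows. This is a minimal illustration rather than the actual SignSpeak code: the class ordering, the 24-letter set, and the `announce` helper are all assumptions, and a real pipeline would feed `predict_letter` the TensorFlow model's output instead of hand-built scores.

```python
import numpy as np

# 24 static fingerspelling letters; J and Z involve motion, so
# single-frame classifiers commonly exclude them (an assumption here).
LETTERS = list("ABCDEFGHIKLMNOPQRSTUVWXY")

def predict_letter(class_scores):
    """Map a vector of per-class scores (e.g. model logits) to a letter."""
    return LETTERS[int(np.argmax(class_scores))]

def announce(letter, speak=print):
    """Announce a recognized letter.

    `speak` defaults to print; in the real app a text-to-speech
    engine's speak function would be passed in instead.
    """
    speak(letter)

# Hand-built scores standing in for the model's output:
scores = [0.0] * 24
scores[2] = 1.0
announce(predict_letter(scores))  # prints "C"
```

Keeping the speech backend as a parameter makes the mapping logic easy to test without audio hardware.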
Challenges we ran into
One of the biggest challenges I faced was training the model with a large set of reference images. Since ASL is a visual language with many gestures that look alike, the model needs a substantial amount of data to differentiate between letters and words. I am currently using 100 reference images per letter, but some letters remain difficult to distinguish because of their visual similarity (for instance, M, N, S, and T all use variations of a closed-fist handshape). This creates the need for even larger datasets to improve the model's accuracy.
Additionally, running the model locally means the system must store the full library of reference images. As the ASL library expands to cover more words, the storage requirements will grow significantly.
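A quick back-of-envelope estimate makes this scaling concrete. The 50 KB-per-image figure below is an assumption for illustration, not a measured value from the project:

```python
def dataset_size_mb(num_classes, images_per_class=100, kb_per_image=50):
    """Estimate on-disk size (MB) of the reference image library.

    kb_per_image is an assumed average for small webcam crops.
    """
    return num_classes * images_per_class * kb_per_image / 1024

print(round(dataset_size_mb(26)))    # 26 letters  -> 127 (MB)
print(round(dataset_size_mb(1000)))  # 1,000 words -> 4883 (MB, ~4.8 GB)
```

Because the cost grows linearly with vocabulary, moving from letters to a word-level vocabulary multiplies storage by roughly the ratio of class counts.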
Accomplishments that we're proud of
Despite these challenges, I was able to get the basic alphabet recognition up and running, with the system accurately translating ASL letters and speaking them aloud. The model is currently able to recognize individual letters, which is an exciting first step toward full ASL translation. This initial success shows the potential of the project to grow into a powerful tool for communication in virtual meetings.
What we learned
Through this project, I learned image processing and gained an understanding of how it impacts the training of machine learning models. I also developed skills in working with webcam input and integrating models into applications.
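As a concrete example of the kind of image processing involved, here is a minimal, dependency-light preprocessing sketch. The 64×64 input size and grayscale normalization are assumptions about the pipeline, and the nearest-neighbour resize is a stand-in for what OpenCV or tf.image would do in practice:

```python
import numpy as np

def preprocess(frame, size=64):
    """Convert an RGB webcam frame to a normalized grayscale square.

    Uses crude nearest-neighbour resampling to stay dependency-free;
    a real pipeline would use OpenCV or tf.image instead.
    """
    gray = frame.mean(axis=2)               # RGB -> grayscale
    h, w = gray.shape
    rows = np.arange(size) * h // size      # sample row indices
    cols = np.arange(size) * w // size      # sample column indices
    small = gray[rows][:, cols]             # nearest-neighbour resize
    return (small / 255.0).astype(np.float32)  # scale to [0, 1]

frame = np.full((120, 160, 3), 255, dtype=np.uint8)  # fake white frame
print(preprocess(frame).shape)  # prints "(64, 64)"
```

Normalizing every frame to the same shape and value range is what lets a single trained model handle input from different webcams.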
What's next for SignSpeak
While the current version supports only fingerspelling recognition, the next step is to train the model to handle full sentences and a wider vocabulary. This will require collecting more data and refining the model to improve accuracy and to better differentiate similar signs. Once the model is robust enough, I plan to implement it as a Zoom widget that allows real-time translation during meetings. The ultimate goal is to make SignSpeak a fully functional tool for ASL users to communicate seamlessly with hearing participants in virtual environments, making online meetings more inclusive for the deaf community.
As I continue developing the project, I also plan to explore other features such as the ability to recognize and translate common ASL phrases and gestures beyond the alphabet, providing even more context for communication in meetings.