Inspiration
Signyfy grew out of a need, and it uses emerging technology to address that need. We do not know ASL, yet we would love to communicate with hearing-impaired individuals, and they with us. Innovative products like the Ray-Ban Meta smart glasses, Apple Vision Pro, and Meta Quest 3 inspired us to build on these advances in tech.
What it does
The Signyfy glasses translate sign language to speech in real time for a non-signer. Using footage of the ASL signer (a hearing-impaired individual) captured by the embedded camera, the glasses first convert signs to text through gesture recognition built on MediaPipe pose recognition and deep learning models. The resulting text is then converted to speech by a Python script that calls Azure text-to-speech services.
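As a rough illustration of the sign-to-text-to-speech flow described above, here is a minimal sketch, not our actual script. It assumes the recognizer emits one label per camera frame (or `None` when no sign is seen) and only accepts a label once it has been stable for several frames. The names `stable_labels` and `speak`, the `min_frames` threshold, and the `key`/`region` parameters are hypothetical; the Azure calls come from the `azure-cognitiveservices-speech` package.

```python
from typing import Iterable, Iterator, Optional


def stable_labels(frame_labels: Iterable[Optional[str]], min_frames: int = 5) -> Iterator[str]:
    """Yield a label once it has been predicted for `min_frames` consecutive frames.

    `None` means "no sign detected" and resets the run, so a sign held for a
    long time is emitted only once.
    """
    current: Optional[str] = None
    run = 0
    for label in frame_labels:
        if label == current:
            run += 1
        else:
            current, run = label, 1
        if current is not None and run == min_frames:
            yield current


def speak(text: str, key: str, region: str) -> bool:
    """Send text to Azure text-to-speech and play it on the default speaker."""
    # Imported lazily so the rest of the sketch runs without the Azure SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=config)
    result = synthesizer.speak_text_async(text).get()
    return result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted


if __name__ == "__main__":
    # Hypothetical per-frame predictions: two signs separated by a short gap.
    frames = ["hello"] * 6 + [None] * 2 + ["thanks"] * 5
    sentence = " ".join(stable_labels(frames, min_frames=5))
    print(sentence)
    # speak(sentence, key="<your-key>", region="<your-region>")  # needs Azure credentials
```

Requiring several matching frames before accepting a sign is one simple way to keep a flickering prediction from being spoken repeatedly; the right threshold would depend on the camera frame rate.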
How we built it
The team began by assembling and testing each hardware component for compatibility. Once a suitable prototype was established, software development began. We first mapped out the system design, process flow, and data flow, then split the work between the text-to-speech functionality and the hardware and modeling functionality. With Python as the main programming language, we wrote a script that uses Azure TTS to read generated text aloud. MediaPipe handled feature extraction and gesture recognition, libcamera interfaced with the embedded cameras, and a Raspberry Pi 5 served as the edge-computing hardware. OpenCV handled computer vision processing, and TensorFlow/Keras was used to build the model.
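To make the feature-extraction step more concrete, here is a hedged sketch of how hand landmarks could be pulled from a single frame and turned into a feature vector. It assumes the MediaPipe Hands solution (`mp.solutions.hands`) and a BGR frame as produced by OpenCV. The function names and the wrist-relative normalization are illustrative assumptions, not a record of what our pipeline does.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def normalize_landmarks(points: Sequence[Point]) -> List[float]:
    """Make landmarks translation- and scale-invariant, then flatten them.

    The first point is treated as the origin (landmark 0 is the wrist in
    MediaPipe Hands) and all points are scaled by the farthest distance from it.
    """
    wx, wy, wz = points[0]
    shifted = [(x - wx, y - wy, z - wz) for x, y, z in points]
    # Fall back to 1.0 if every point coincides with the wrist.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in shifted) or 1.0
    return [c / scale for p in shifted for c in p]


def hand_features(image_bgr) -> Optional[List[float]]:
    """Return 63 normalized features (21 landmarks x 3) for the first hand, or None."""
    # Lazy imports: OpenCV and MediaPipe are only needed when processing real frames.
    import cv2
    import mediapipe as mp

    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    hand = results.multi_hand_landmarks[0]
    return normalize_landmarks([(lm.x, lm.y, lm.z) for lm in hand.landmark])
```

For live video on the Raspberry Pi, it would likely be better to create the `Hands` object once with `static_image_mode=False` and reuse it across frames, rather than rebuilding it per image as this sketch does.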
Challenges we ran into
MediaPipe, which we used for feature extraction, had a breaking bug in the MediaPipe Model Maker API for building on top of its models. Because projects like ours are rare and little material is available, the team had to train the model manually, which was time-consuming and less efficient. Training it properly would have required generating thousands of images and videos.
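For readers wondering what training manually, without Model Maker, can look like, here is a minimal Keras sketch. It assumes each training sample is a flat vector of landmark features with a sign name as its label. The layer sizes, the dropout rate, and the helper names `encode_labels` and `build_classifier` are hypothetical choices for illustration, not the architecture we shipped.

```python
from typing import Dict, List, Sequence, Tuple


def encode_labels(names: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map sign names to integer class ids, assigned in sorted order."""
    index = {name: i for i, name in enumerate(sorted(set(names)))}
    return [index[n] for n in names], index


def build_classifier(num_features: int, num_classes: int):
    """Build a small dense network over flattened landmark features."""
    # Lazy import so the label helper can be used without TensorFlow installed.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(num_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Integer class ids pair with the sparse categorical loss.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With feature vectors `X` and ids from `encode_labels`, training would then be a call along the lines of `model.fit(np.array(X), np.array(ids), epochs=30, validation_split=0.2)`. How well this works depends heavily on how many samples are collected per sign, which is exactly the bottleneck described above.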
Accomplishments that we're proud of
We are proud of setting up the hardware and getting its functionality to work, building a data pipeline, getting the hand tracking to respond well, using MediaPipe to extract data, and integrating text-to-speech.
What we learned
The team learned about hardware and software integration, resilience in debugging, manual model training and data collection, setting up a robust data pipeline, and the software development lifecycle of a product.
What's next for Signyfy
Next steps include idea validation, access to a larger and more diverse dataset, more research into fine-tuning the system, and a mobile app that integrates seamlessly with the glasses. We also want to secure funding and build a team to take this from idea/prototype to an actual product.
Built With
- adobe-creative-sdk
- azure
- keras
- libcamera
- mediapipe
- mongodb
- opencv
- python
- raspberry-pi
- tensorflow