Inspiration
Our group leader, En Qi, is part of our school's Sign Language Club. When we saw the problem statement about Inspiring Creativity with GenAI, we were immediately interested in making TikTok more inclusive for deaf users. Although many videos have captions, some deaf viewers may prefer watching a video through sign language, and others understand sign language but are unable to read. That's why we believe that adding sign language translation on screen would bring more accessibility options to TikTok.
What It Does
Our current prototype is a desktop website that converts an MP4 video (simulating an actual TikTok reel) into American Sign Language (ASL), shown through a skeleton avatar. It works in three simple steps: upload the video, click the convert button, and watch the real-time sign language translation. The generated sign language avatar is superimposed onto the uploaded video, which is then downloaded directly to your device.
In the long run, we envision TikTok offering an option to display a sign language avatar at the bottom right of the screen (or anywhere, according to users' preferences). Almost every country has Deaf TikTok users; it's time to make the platform more inclusive for them.
How We Built It
We built the site with React. We first implemented video-to-text conversion, using ffmpeg to extract the audio and the HuggingFace API for accurate speech recognition. We then converted the text into sign language actions, adapting code from this GitHub Repository.
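The speech-recognition step can be sketched as a call to the HuggingFace Inference API on the audio that ffmpeg extracted. This is a minimal sketch, not our exact code: the model name `openai/whisper-large-v3` and the helper names `buildAsrRequest` and `transcribe` are illustrative assumptions.

```typescript
// Sketch of the audio-to-text step via the HuggingFace Inference API.
// Model choice and helper names are assumptions, not the project's actual code.

interface AsrRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request metadata for a given model and API token.
function buildAsrRequest(model: string, token: string): AsrRequest {
  return {
    url: `https://api-inference.huggingface.co/models/${model}`,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "audio/wav",
    },
  };
}

// Send audio (e.g. extracted beforehand with
// `ffmpeg -i input.mp4 -vn -ar 16000 -ac 1 audio.wav`)
// and return the recognized transcript text.
async function transcribe(audio: ArrayBuffer, token: string): Promise<string> {
  const req = buildAsrRequest("openai/whisper-large-v3", token);
  const res = await fetch(req.url, {
    method: "POST",
    headers: req.headers,
    body: audio,
  });
  // ASR models on the Inference API respond with a { text: ... } payload.
  const json = (await res.json()) as { text?: string };
  return json.text ?? "";
}
```

Keeping the request construction in a separate pure function makes the networking easy to swap out or test without an actual API call.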
Challenges We Ran Into
- Learning Curve: We faced challenges using GenAI, learning TypeScript, and deploying the website to a server. These were new to us, and we had to learn them within a week.
- Deployment Issues: We could not deploy the app because its size exceeded free-plan limits, so we made sure anyone can run it locally instead.
- Complexity of Repository: The open-source repository we used was complex. It took us 24 hours of continuous work to find editable parts. Additionally, we had no prior experience with TypeScript.
- UI Design: Due to time constraints, we could not beautify the UI. It is practical, and shows our goal, but is not aesthetically pleasing.
Accomplishments That We're Proud Of
- Real-Time Translation: Successfully implemented real-time sign language translation, providing an immediate and seamless experience for users.
- Full Workflow Implementation: Creating the workflow from file to text to skeleton sign and superimposing it on the original video. This process took much more time and effort than expected, and we're proud of the result.
- Understanding the Target Audience: We researched and understood our target audience, Deaf TikTok users. While working on the speech-to-sign language model, we also identified features like consistent caption placement, sound descriptions, and customizable accessibility features that deaf users would appreciate.
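The text-to-sign part of the workflow above can be sketched as a word-bank lookup, with letter-by-letter fingerspelling as a fallback for words the bank doesn't cover. This is a hypothetical sketch: the names `SIGN_BANK` and `textToSigns`, and the tiny example bank, are ours, not taken from the repository we adapted.

```typescript
// Hypothetical sketch of mapping a transcript to sign-animation tokens.
// The word bank and all names here are illustrative, not the project's code.

// Word bank mapping lowercase English words to sign-animation IDs.
const SIGN_BANK: Record<string, string> = {
  hello: "sign_hello",
  world: "sign_world",
  thank: "sign_thank",
  you: "sign_you",
};

// Translate a transcript into a flat list of sign tokens; any word
// missing from the bank is fingerspelled one letter at a time.
function textToSigns(transcript: string): string[] {
  return transcript
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0)
    .flatMap((word) =>
      word in SIGN_BANK
        ? [SIGN_BANK[word]]
        : word.split("").map((ch) => `fingerspell_${ch}`)
    );
}
```

For example, `textToSigns("Hello hi")` yields `["sign_hello", "fingerspell_h", "fingerspell_i"]`. Expanding the word bank directly shrinks how often the avatar has to fall back to fingerspelling.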
What We Learned
- Technical Skills: How to build a website using React, understand and write TypeScript, and use ffmpeg to convert video to audio.
- API Utilization: Using the HuggingFace API for accurate speech recognition and converting audio to text.
What's Next for SPSigns
- Deployment: Successfully deploy the application on a scalable platform.
- Increase Accuracy: Improve the accuracy of speech recognition.
- Expand Word Bank: Increase the word bank and corresponding signs.
- Support Multiple Sign Languages: Expand support to include multiple country sign languages, not just ASL.