Inspiration
Communication is fundamental to truth, dignity, and inclusion. For many Deaf and hard-of-hearing individuals, everyday interactions can become barriers when others do not understand sign language. We were inspired to build a system that uses AI to reduce that barrier in real time.
Our goal is not to replace human interpreters but to build an assistive tool that supports spontaneous communication in classrooms, workplaces, public services, and everyday conversations.
What it does
TrueSign.AI translates live American Sign Language (ASL) alphabet gestures into readable text in real time. The system:
- Captures live video input
- Detects and tracks hand landmarks
- Classifies ASL letters using a machine learning model
- Displays predicted text instantly
- Converts text into speech for hearing users
By turning sign into text and speech, the system acts as a communication bridge between Deaf and hearing individuals.
How we built it
Our system follows a real-time AI pipeline:
Camera → Hand Landmark Detection → Feature Normalization → Classification → Text + Speech Output
We used MediaPipe Hands to extract 21 three-dimensional hand landmarks per frame, giving us 63 features (an x, y, and z value per landmark) for each image.
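A minimal sketch of this capture-and-extract step, assuming MediaPipe Hands and OpenCV as listed under Built With (the function and variable names here are illustrative, not our exact code):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_landmarks(frame_bgr, hands):
    """Return a flat list of 63 values (x, y, z for 21 landmarks), or None."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    results = hands.process(rgb)
    if not results.multi_hand_landmarks:
        return None                                   # no hand in this frame
    hand = results.multi_hand_landmarks[0]            # first detected hand
    return [coord for lm in hand.landmark for coord in (lm.x, lm.y, lm.z)]

with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    if ok:
        features = extract_landmarks(frame, hands)    # 63 floats or None
    cap.release()
```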
To improve robustness across users and camera positions, we normalized the landmarks: each point's coordinates are computed relative to the wrist, then scaled so that hand size and distance from the camera do not skew the results.
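One way to express that normalization in NumPy; the exact scale factor we used may differ (here it is the largest coordinate magnitude after centering on the wrist):

```python
import numpy as np

def normalize_landmarks(flat63):
    """Wrist-relative, scale-invariant features. MediaPipe landmark 0 is the wrist."""
    pts = np.asarray(flat63, dtype=np.float32).reshape(21, 3)
    pts -= pts[0]                 # translate so the wrist sits at the origin
    scale = np.max(np.abs(pts))   # largest coordinate magnitude
    if scale > 0:
        pts /= scale              # hand size / camera distance cancels out
    return pts.flatten()          # back to a 63-value feature vector
```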
This standardized data was then used to train a Random Forest classifier to predict which letter of the alphabet (from A to Z) the user was signing.
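A training sketch with scikit-learn, assuming the normalized features have already been collected into arrays (`load_dataset` is a hypothetical stand-in for however the recorded samples are loaded, and the hyperparameters are illustrative, not our tuned values):

```python
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical loader: X is (n_samples, 63) normalized landmark features,
# y is the matching letter labels "A".."Z".
X, y = load_dataset()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

# Persist the model with pickle (part of our stack) for the live pipeline.
with open("asl_rf_model.pkl", "wb") as f:
    pickle.dump(clf, f)
```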
We implemented prediction smoothing to reduce flickering between letters and added a word buffer to construct complete words. Finally, we integrated text-to-speech so recognized words can be spoken aloud.
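A minimal sketch of that smoothing and buffering; the window size and agreement threshold are illustrative, and the text-to-speech hand-off is shown only as a comment:

```python
from collections import Counter, deque

class PredictionSmoother:
    """Majority vote over a sliding window of recent per-frame predictions."""
    def __init__(self, window=15, min_agreement=0.8):
        self.window = deque(maxlen=window)
        self.min_agreement = min_agreement

    def update(self, letter):
        self.window.append(letter)
        top, count = Counter(self.window).most_common(1)[0]
        full = len(self.window) == self.window.maxlen
        if full and count / len(self.window) >= self.min_agreement:
            return top   # stable prediction
        return None      # still flickering between letters

smoother = PredictionSmoother()
word_buffer = []
# per frame: letter = clf.predict([features])[0]
stable = smoother.update("A")                # current frame's prediction
if stable and (not word_buffer or word_buffer[-1] != stable):
    word_buffer.append(stable)               # append once per stable letter
# "".join(word_buffer) is what gets sent to text-to-speech
# (ElevenLabs in our stack) once the word is complete.
```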
Challenges we ran into
- Motion-based letters such as J and Z require temporal modeling, while our MVP uses frame-based classification.
- Prediction instability due to slight hand movements required smoothing techniques.
- Lighting and camera variability impacted landmark detection accuracy.
- Scope management was critical — full ASL translation includes facial expressions and grammar, so we focused on the alphabet as a scalable proof of concept within 24 hours.
Accomplishments that we're proud of
- Building a fully functioning real-time ASL alphabet translator in under 24 hours
- Designing a clean, modular AI pipeline from detection to speech output
- Implementing feature normalization to significantly improve model stability
- Creating a system that aligns strongly with the theme of AI for truth and service
What we learned
- The importance of feature engineering in real-time ML systems
- The difference between static classification and temporal sequence modeling
- How critical scope control is in fast-paced development environments
- That impactful AI solutions require both technical rigor and human-centered design
What's next for TrueSign.AI
- Add word- and phrase-level recognition using temporal sequence models
- Expand beyond the alphabet to dynamic signs
- Deploy as a mobile or web-based application
- Integrate into educational and healthcare environments
- Improve multilingual speech output and personalization
Built With
- css
- elevenlabs
- flask
- html
- javascript
- numpy
- opencv
- pandas
- pickle
- python
- scikit-learn