Inspiration
Communication is fundamental to truth, dignity, and inclusion. For many Deaf and hard-of-hearing individuals, everyday interactions can become barriers when others do not understand sign language. We were inspired to build a system that uses AI to reduce that barrier in real time.
Our goal is not to replace human interpreters but to build an assistive tool that supports spontaneous communication in classrooms, workplaces, public services, and everyday conversations.
What it does
TrueSign.AI translates live American Sign Language (ASL) alphabet gestures into readable text in real time. The system:
- Captures live video input
- Detects and tracks hand landmarks
- Classifies ASL letters using a machine learning model
- Displays predicted text instantly
- Converts text into speech for hearing users
By turning sign into text and speech, the system acts as a communication bridge between Deaf and hearing individuals.
How we built it
Our system follows a real-time AI pipeline:
Camera → Hand Landmark Detection → Feature Normalization → Classification → Text + Speech Output
We used MediaPipe Hands to extract 21 three-dimensional hand landmarks per frame, giving us 63 features (an x, y, and z value per landmark) for each image.
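A minimal sketch of this capture-and-extract step, assuming MediaPipe Hands and OpenCV as listed under Built With (the function and variable names here are illustrative, not our exact code):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def extract_landmarks(frame_bgr, hands):
    """Return a flat list of 63 values (x, y, z for 21 landmarks), or None."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    results = hands.process(rgb)
    if not results.multi_hand_landmarks:
        return None                                   # no hand in this frame
    hand = results.multi_hand_landmarks[0]            # first detected hand
    return [coord for lm in hand.landmark for coord in (lm.x, lm.y, lm.z)]

with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    cap = cv2.VideoCapture(0)
    ok, frame = cap.read()
    if ok:
        features = extract_landmarks(frame, hands)    # 63 floats or None
    cap.release()
```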
To improve robustness across users and camera positions, we normalized the landmarks: each point's coordinates are computed relative to the wrist, then scaled so that hand size and distance from the camera do not skew the results.
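One way to express that normalization in NumPy; the exact scale factor we used may differ (here it is the largest coordinate magnitude after centering on the wrist):

```python
import numpy as np

def normalize_landmarks(flat63):
    """Wrist-relative, scale-invariant features. MediaPipe landmark 0 is the wrist."""
    pts = np.asarray(flat63, dtype=np.float32).reshape(21, 3)
    pts -= pts[0]                 # translate so the wrist sits at the origin
    scale = np.max(np.abs(pts))   # largest coordinate magnitude
    if scale > 0:
        pts /= scale              # hand size / camera distance cancels out
    return pts.flatten()          # back to a 63-value feature vector
```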
This standardized data was then used to train a Random Forest classifier to predict which letter of the alphabet (from A to Z) the user was signing.
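A training sketch with scikit-learn, assuming the normalized features have already been collected into arrays (`load_dataset` is a hypothetical stand-in for however the recorded samples are loaded, and the hyperparameters are illustrative, not our tuned values):

```python
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical loader: X is (n_samples, 63) normalized landmark features,
# y is the matching letter labels "A".."Z".
X, y = load_dataset()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))

# Persist the model with pickle (part of our stack) for the live pipeline.
with open("asl_rf_model.pkl", "wb") as f:
    pickle.dump(clf, f)
```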
We implemented prediction smoothing to reduce flickering between letters and added a word buffer to construct complete words. Finally, we integrated text-to-speech so recognized words can be spoken aloud.
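A minimal sketch of that smoothing and buffering; the window size and agreement threshold are illustrative, and the text-to-speech hand-off is shown only as a comment:

```python
from collections import Counter, deque

class PredictionSmoother:
    """Majority vote over a sliding window of recent per-frame predictions."""
    def __init__(self, window=15, min_agreement=0.8):
        self.window = deque(maxlen=window)
        self.min_agreement = min_agreement

    def update(self, letter):
        self.window.append(letter)
        top, count = Counter(self.window).most_common(1)[0]
        full = len(self.window) == self.window.maxlen
        if full and count / len(self.window) >= self.min_agreement:
            return top   # stable prediction
        return None      # still flickering between letters

smoother = PredictionSmoother()
word_buffer = []
# per frame: letter = clf.predict([features])[0]
stable = smoother.update("A")                # current frame's prediction
if stable and (not word_buffer or word_buffer[-1] != stable):
    word_buffer.append(stable)               # append once per stable letter
# "".join(word_buffer) is what gets sent to text-to-speech
# (ElevenLabs in our stack) once the word is complete.
```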
Challenges we ran into
- Motion-based letters such as J and Z require temporal modeling, while our MVP uses frame-based classification.
- Prediction instability due to slight hand movements required smoothing techniques.
- Lighting and camera variability impacted landmark detection accuracy.
- Scope management was critical — full ASL translation includes facial expressions and grammar, so we focused on the alphabet as a scalable proof of concept within 24 hours.
Accomplishments that we're proud of
- Building a fully functioning real-time ASL alphabet translator in under 24 hours
- Designing a clean, modular AI pipeline from detection to speech output
- Implementing feature normalization to significantly improve model stability
- Creating a system that aligns strongly with the theme of AI for truth and service
What we learned
- The importance of feature engineering in real-time ML systems
- The difference between static classification and temporal sequence modeling
- How critical scope control is in fast-paced development environments
- That impactful AI solutions require both technical rigor and human-centered design
What's next for TrueSign.AI
- Add word- and phrase-level recognition using temporal sequence models
- Expand beyond the alphabet to dynamic signs
- Deploy as a mobile or web-based application
- Integrate into educational and healthcare environments
- Improve multilingual speech output and personalization
Built With
- css
- elevenlabs
- flask
- html
- javascript
- numpy
- opencv
- pandas
- pickle
- python
- scikit-learn