Inspiration
People who are deaf or hard of hearing often struggle to communicate, since few hearing people know sign language. Our goal is to bridge this gap with an app that translates sign language to text in real time, combining vision models and language models.
To make learning inclusive and fun, we also added a mini-game that helps users practice and learn ASL interactively.
What it does
The app uses the camera to detect sign gestures and displays real-time subtitles. It supports:
Character-level translation:
- MediaPipe landmark extraction
- Custom dataset + Random Forest classifier
- LLM-powered autocomplete for fast, predictive text
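The character-level pipeline above can be sketched roughly as follows: MediaPipe Hands yields 21 (x, y, z) landmarks per frame, which are flattened into a feature vector and fed to a Random Forest. The training data here is random stand-in data and the three-letter label set is illustrative; the real app trains on a custom landmark dataset over the full alphabet.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data: 120 fake samples of 21 landmarks * (x, y, z).
# The real classifier is trained on a custom MediaPipe landmark dataset.
rng = np.random.default_rng(0)
X = rng.random((120, 63))
y = rng.choice(list("ABC"), 120)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def classify_frame(landmarks):
    """landmarks: 21 (x, y, z) tuples from MediaPipe Hands for one frame."""
    features = np.asarray(landmarks, dtype=float).reshape(1, -1)  # (1, 63)
    return clf.predict(features)[0]

letter = classify_frame(rng.random((21, 3)))
print(letter)  # one of "A", "B", "C" for this toy label set
```

Because landmarks are low-dimensional and cheap to extract, this classifier runs comfortably per-frame on device or server.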
Word-level translation:
- Pretrained I3D (dataset: WLASL-2000)
- Fine-tuned T5 (dataset: ASLG-PC12) to translate glosses (e.g., “NAME WHAT”) → natural English (“What’s your name?”)
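A minimal sketch of how gloss/English pairs from ASLG-PC12 might be framed as a T5 text-to-text task for fine-tuning. The task-prefix wording is an assumption for illustration, not the project's exact preprocessing:

```python
def make_t5_example(gloss: str, english: str) -> dict:
    """Frame an ASL gloss -> English pair as a T5 text-to-text example.

    The "translate ASL gloss to English:" prefix is a hypothetical
    task prefix in the usual T5 style, not the app's verified one.
    """
    return {
        "input": "translate ASL gloss to English: " + gloss,
        "target": english,
    }

example = make_t5_example("NAME WHAT", "What's your name?")
print(example["input"])   # translate ASL gloss to English: NAME WHAT
print(example["target"])  # What's your name?
```

At inference time, the I3D classifier's predicted glosses are concatenated into the input string and T5 generates the natural-English sentence.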
Mini-Game Mode
- ASL learning game: users must perform correct characters to progress
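The game loop's core state can be sketched as below: the player advances to the next target letter only when the classifier's prediction matches it. The class name and API here are illustrative, not the app's actual code.

```python
class MiniGame:
    """Track progress through a sequence of target ASL characters."""

    def __init__(self, targets):
        self.targets = targets  # e.g. list("ASL")
        self.index = 0          # position of the next letter to perform

    def submit(self, predicted_char):
        """Feed one classifier prediction; return True when the round is won."""
        if self.index < len(self.targets) and predicted_char == self.targets[self.index]:
            self.index += 1     # correct sign: advance
        return self.index >= len(self.targets)

game = MiniGame(list("ASL"))
game.submit("A")
game.submit("X")          # wrong sign: no progress
game.submit("S")
print(game.submit("L"))   # True: round complete
```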
How we built it
- Frontend: Expo
- Backend: Modal for scalable model inference
Vision Models:
- MediaPipe landmarks + custom character classifier
- I3D for word-level classification, trained on 26,027 WLASL sign videos
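To illustrate the word-level model's interface, here is a toy stand-in for I3D (a single 3D convolution plus a linear head, nothing like the real architecture) showing the expected video-clip tensor shape and the 2,000-class WLASL output:

```python
import torch
import torch.nn as nn

class TinyI3D(nn.Module):
    """Toy stand-in for I3D, only to illustrate input/output shapes."""

    def __init__(self, num_classes=2000):  # WLASL-2000 vocabulary size
        super().__init__()
        self.conv = nn.Conv3d(3, 8, kernel_size=3, padding=1)
        self.head = nn.Linear(8, num_classes)

    def forward(self, clip):
        # clip: (batch, channels, time, height, width)
        feat = self.conv(clip).mean(dim=(2, 3, 4))  # global avg pool -> (B, 8)
        return self.head(feat)                      # logits over 2000 signs

clip = torch.randn(1, 3, 16, 112, 112)  # one 16-frame RGB clip
logits = TinyI3D()(clip)
print(logits.shape)  # torch.Size([1, 2000])
```

The real pretrained I3D consumes the same (B, C, T, H, W) clip layout; the predicted class index maps to a WLASL gloss that is then passed to T5.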
Language Models:
- T5 fine-tuned on ASLG-PC12
- Gemini API for real-time smart suggestions
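A simple prefix-matching fallback sketch of the predictive-text idea: given the characters fingerspelled so far, rank candidate completions. The app itself queries the Gemini API for smarter, context-aware suggestions; the network call is omitted here to keep the sketch self-contained, and the vocabulary is a made-up example.

```python
# Illustrative vocabulary; the real suggestions come from Gemini.
VOCAB = ["hello", "help", "name", "nice", "thanks", "what", "where"]

def suggest(prefix, k=3):
    """Return up to k vocabulary words starting with the typed prefix."""
    matches = [w for w in VOCAB if w.startswith(prefix.lower())]
    return matches[:k]

print(suggest("he"))  # ['hello', 'help']
```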
Challenges we ran into
- Expo had trouble keeping WebSocket connections stable on physical phones
- Modal's cold starts add noticeable latency the first time a user queries a model
- Word-level translation is less accurate than character-level, since each sign must be classified against the full 2,000-word WLASL vocabulary
Accomplishments that we're proud of
- Dual-level gesture recognition (char + word)
- Fine-tuned T5 for ASL gloss-to-English
- Real-time interactive app with:
  - Subtitle overlay
  - ASL mini-game
  - LLM-powered suggestions
📚 What We Learned
- Modal = dead-simple scalable ML infra
- Expo = rapid cross-platform mobile dev
- LLM + vision fusion dramatically improves UX
🔮 What's Next
- Context-aware suggestions (e.g., location, topic)
- Text-to-speech for full bidirectional comms
- Multi-person sign detection
- Expanded vocab and regional dialect support
Built With
- aslg
- expo.io
- gemini
- i3d
- mediapipe
- modal
- python
- pytorch
- t5
- typescript
- webcam
- wlasl