Communication is one of the most essential human needs, yet many people who rely on sign language face daily challenges in being understood.
I wanted to create a simple, offline, and accessible system that can translate hand gestures into text using only a webcam.
This idea came from wanting to help Deaf, hard-of-hearing, and nonverbal people communicate more easily without relying on expensive hardware or cloud AI services.
The project was built completely in Python, using:
OpenCV for real-time video capture and on-screen UI
MediaPipe for detecting and tracking 21 hand landmarks
NumPy for mathematical processing and normalization of coordinates
A custom nearest-neighbor classifier (no pre-trained model!) that compares new gestures to recorded reference samples stored in JSON
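The classification idea can be sketched roughly like this (function and variable names here are illustrative, not the project's actual code): each recorded gesture is stored as a flat list of normalized landmark coordinates, and a new frame is labeled with whichever reference sample is closest by Euclidean distance.

```python
import json
import numpy as np

def classify(sample, references):
    """Label a gesture by its nearest recorded reference sample.

    sample     -- flat array of normalized landmark coordinates
                  (length 42 for 21 x/y pairs)
    references -- dict mapping label -> list of recorded sample arrays
    """
    best_label, best_dist = None, float("inf")
    for label, samples in references.items():
        for ref in samples:
            dist = np.linalg.norm(sample - np.asarray(ref))
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist

# Toy example with short vectors, stored the way they might live in JSON.
refs_json = json.dumps({"A": [[0.0, 0.0, 1.0, 1.0]],
                        "B": [[5.0, 5.0, 6.0, 6.0]]})
references = json.loads(refs_json)
label, dist = classify(np.array([0.1, 0.0, 1.0, 0.9]), references)
# label is "A": the sample sits far closer to A's reference than to B's.
```

Because there is no training step, adding a new letter is just appending another reference sample to the JSON file.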
Challenges I Faced
Model accuracy: Hand shapes vary from person to person, so I had to design a normalization system to make recognition consistent.
Lighting conditions: Poor lighting affected MediaPipe’s landmark detection, which required tuning the confidence thresholds.
Performance: Keeping a real-time frame rate while processing 21 landmark points per frame required optimization.
UI clarity: Creating an intuitive way for users to record letters and form words took multiple iterations.
Impact
This project shows that with just a webcam and open-source tools, we can empower accessibility, turning gestures into text and signs into understanding.
