Communication is one of the most essential human needs, yet many people who rely on sign language face daily challenges in being understood.

I wanted to create a simple, offline, and accessible system that can translate hand gestures into text using only a webcam.

This idea came from wanting to help deaf and hard-of-hearing people communicate more easily without relying on expensive hardware or cloud AI services.

The project was built completely in Python, using:

OpenCV for real-time video capture and on-screen UI

MediaPipe for detecting and tracking 21 hand landmarks

NumPy for mathematical processing and normalization of coordinates

A custom nearest-neighbor classifier (no pre-trained model!) that compares new gestures to recorded reference samples stored in JSON
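The nearest-neighbor idea can be sketched roughly like this (the function names and JSON layout here are illustrative assumptions, not the project's actual code): each recorded sample is a flat vector of normalized coordinates, and a new gesture gets the label of the closest stored reference by Euclidean distance, with an optional rejection threshold for unknown poses.

```python
import json
import numpy as np

def load_references(path):
    # Assumed JSON layout: {"A": [[x0, y0, x1, y1, ...], ...], "B": [...]}
    with open(path) as f:
        data = json.load(f)
    return {label: [np.asarray(s, dtype=float) for s in samples]
            for label, samples in data.items()}

def classify(landmarks, references, max_dist=0.5):
    # landmarks: flat array of normalized coordinates for one frame
    query = np.asarray(landmarks, dtype=float)
    best_label, best_dist = None, float("inf")
    for label, samples in references.items():
        for ref in samples:
            dist = np.linalg.norm(query - ref)  # Euclidean distance
            if dist < best_dist:
                best_label, best_dist = label, dist
    # Reject gestures that are far from every recorded reference
    return best_label if best_dist <= max_dist else None
```

Because every comparison is a single vector distance, even a few hundred reference samples stay well within a real-time frame budget.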

Challenges I Faced

Model accuracy: Hands vary in size and position within the frame, so I had to design a normalization step to make recognition consistent across users.
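A minimal version of such a normalization (a sketch of the idea, not the project's exact code) translates the landmarks so the wrist sits at the origin and scales by the hand's overall span, so the same pose matches regardless of where the hand is or how large it appears:

```python
import numpy as np

def normalize_landmarks(points):
    """points: (21, 2) array of MediaPipe (x, y) coordinates.
    Returns a flat vector invariant to translation and scale."""
    pts = np.asarray(points, dtype=float)
    pts = pts - pts[0]        # landmark 0 is the wrist: move it to the origin
    span = np.abs(pts).max()  # largest coordinate magnitude
    if span > 0:
        pts /= span           # scale so values lie in [-1, 1]
    return pts.flatten()
```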

Lighting conditions: Poor lighting affected MediaPipe’s landmark detection, which required tuning the confidence thresholds.
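MediaPipe Hands exposes those thresholds directly in its constructor; a configuration along these lines (the specific values are examples to tune per environment, not the project's settings) trades missed detections against false positives:

```python
import mediapipe as mp

# Higher detection confidence reduces spurious detections in dim scenes,
# at the cost of dropping frames where the hand is barely visible.
hands = mp.solutions.hands.Hands(
    static_image_mode=False,       # video stream: track landmarks between frames
    max_num_hands=1,
    min_detection_confidence=0.7,  # example value; tune for your lighting
    min_tracking_confidence=0.5,
)
```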

Performance: Ensuring real-time FPS while processing 21 landmark points per frame took optimization.
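One generic way to keep an eye on throughput while optimizing (a common pattern, not the project's code) is an exponentially smoothed frames-per-second counter updated once per loop iteration:

```python
import time

class FPSMeter:
    """Exponentially smoothed FPS estimate, updated once per frame."""
    def __init__(self, alpha=0.1):
        self.alpha = alpha   # smoothing factor: higher = more responsive
        self.fps = 0.0
        self._last = None

    def tick(self):
        now = time.perf_counter()
        if self._last is not None:
            inst = 1.0 / (now - self._last)  # instantaneous FPS
            self.fps = inst if self.fps == 0 else (
                self.alpha * inst + (1 - self.alpha) * self.fps)
        self._last = now
        return self.fps
```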

UI clarity: Creating an intuitive way for users to record letters and form words took multiple iterations.

Impact

This project shows that with just a webcam and open-source tools, we can empower accessibility — turning signs into text and gestures into understanding.
