Inspiration

The inspiration for this project came from the desire to build an accessible tool for people who communicate through American Sign Language (ASL). Sign language is a vital means of communication for the deaf and hard-of-hearing communities, yet there is little technology that bridges the gap between signers and people who do not sign. Our goal was to develop a real-time, user-friendly system that recognizes common ASL gestures and translates them into text, improving communication accessibility.

What it does

This project is a real-time American Sign Language (ASL) gesture recognition system. It uses a webcam to detect hand gestures and translates them into text using a model trained on the first three letters of the ASL alphabet (A, B, C). The system identifies hand landmarks, extracts relevant features, and classifies the gesture with a K-Nearest Neighbors (KNN) machine learning model. Once a gesture is recognized, the corresponding letter is displayed on screen.

How we built it

To build this project, we used several key technologies:

- Python: the primary programming language for the project.
- OpenCV: handles the webcam feed and draws landmarks on the hand.
- Mediapipe: a Google framework that detects hand landmarks in real time.
- Scikit-learn: trains the K-Nearest Neighbors (KNN) model that classifies hand gestures.
- Joblib: saves and loads the trained model so it can be reused without retraining.
- NumPy: handles data processing and arrays, especially the hand landmarks and features.
- Pickle: saves and loads serialized data, including the trained model and features.
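For anyone reproducing the setup, these dependencies install from PyPI roughly like this (Pickle ships with Python's standard library, so it needs no install):

```
pip install opencv-python mediapipe scikit-learn joblib numpy
```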

The workflow involved:

1. Data collection: we recorded video of people performing the ASL gestures for A, B, and C, then extracted and saved the hand landmarks (key points on the hand).
2. Feature extraction: we processed the landmarks to compute relevant features (e.g., distances between fingers) and used these features to train the model.
3. Model training: we trained a K-Nearest Neighbors (KNN) classifier on the extracted features.
4. Real-time gesture recognition: the trained model was deployed in a real-time system that processes webcam input, detects hand gestures, and predicts the ASL letter being signed (see the sketch below).
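As a rough sketch of the feature-extraction and training stages (the exact feature set, file names like features.pkl and asl_knn.joblib, and the KNN settings are illustrative assumptions, not our exact code):

```python
import pickle

import joblib
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def extract_features(landmarks):
    """Turn the 21 Mediapipe hand landmarks (x, y) into a feature vector.

    Here: distances from the wrist (landmark 0) to each fingertip
    (landmarks 4, 8, 12, 16, 20), normalized by hand size -- one
    plausible choice of features, not necessarily our exact set.
    """
    pts = np.asarray(landmarks, dtype=float)       # shape (21, 2)
    wrist = pts[0]
    scale = np.linalg.norm(pts[9] - wrist) or 1.0  # middle-finger knuckle as a size reference
    tips = pts[[4, 8, 12, 16, 20]]
    return np.linalg.norm(tips - wrist, axis=1) / scale

# Landmarks and labels saved during data collection ("features.pkl" is a
# hypothetical filename): a list of 21x2 landmark arrays plus "A"/"B"/"C" labels.
with open("features.pkl", "rb") as f:
    all_landmarks, labels = pickle.load(f)

X = np.array([extract_features(lm) for lm in all_landmarks])
y = np.array(labels)

# Hold out a test split so accuracy reflects unseen gestures
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("held-out accuracy:", knn.score(X_test, y_test))

joblib.dump(knn, "asl_knn.joblib")  # reload later without retraining
```

KNN is a natural fit at this scale: with only three classes and a handful of distance features, it needs no training beyond storing the examples.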

Challenges we ran into

- Feature extraction: one of the major challenges was ensuring we extracted the right features from the hand landmarks. Mediapipe provides 21 key points per hand, and we had to determine which ones mattered most for gesture classification.
- Model training: initially the model wasn't recognizing gestures effectively; we had to fine-tune the feature extraction and adjust the KNN parameters to improve accuracy.
- Real-time performance: processing the webcam feed and making predictions on the fly was computationally demanding, and we had to optimize the system so it ran smoothly without delays (a sketch of that loop follows this list).
- Data collection: we had limited data for each letter, which hurt model accuracy. Collecting more diverse examples with varied gestures was time-consuming and required careful manual recording.
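For reference, the real-time loop looked roughly like this (a sketch assuming the extract_features() helper and asl_knn.joblib model from the training sketch above; the exact details varied):

```python
import cv2
import joblib
import mediapipe as mp

knn = joblib.load("asl_knn.joblib")  # trained model from the sketch above

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # Mediapipe expects RGB; OpenCV delivers BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            hand = results.multi_hand_landmarks[0]
            mp_draw.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
            landmarks = [(lm.x, lm.y) for lm in hand.landmark]
            letter = knn.predict([extract_features(landmarks)])[0]
            cv2.putText(frame, str(letter), (30, 60),
                        cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
        cv2.imshow("ASL translator", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```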

Accomplishments that we're proud of

- Real-time gesture recognition: despite challenges with performance and accuracy, we implemented a working system that recognizes and translates ASL gestures in real time.
- Machine learning integration: we successfully trained a K-Nearest Neighbors model that classifies the ASL gestures and outputs the corresponding letter.
- Feature extraction: we managed to extract meaningful features from the hand landmarks, making it possible for the model to classify gestures with acceptable accuracy.
- Accessibility focus: we created a tool with the potential to improve communication for the deaf and hard-of-hearing community by breaking down language barriers.

What we learned

- Computer vision and hand tracking: we gained a deep understanding of how hand tracking works with the Mediapipe library and how to process video feeds to detect key hand landmarks.
- Machine learning: we learned how to train and deploy a simple model (KNN) and how to preprocess data for feature extraction.
- Real-time application development: building a real-time computer vision system requires optimizing both the algorithm and the surrounding code so performance doesn't degrade under load.
- Data collection and annotation: diverse, representative training data matters enormously; collecting accurate and varied ASL data is crucial to making the model robust.

What's next for ASL translator

- Expanding gesture recognition: move beyond the first three letters (A, B, C) to a larger set of characters, or even complete words and phrases.
- Improved model: explore more advanced machine learning approaches such as deep learning (using TensorFlow or PyTorch) to improve accuracy, especially on more complex gestures.
- Mobile and web application: develop a mobile app or web app to make the ASL translator accessible on smartphones and other devices.
- Real-time communication: integrate the translator into real-time communication platforms (e.g., video calls) so users can communicate more naturally and seamlessly.
- Multilingual sign language support: expand to other sign languages (e.g., British Sign Language, French Sign Language) by training separate models for each.

Built With

Python, OpenCV, Mediapipe, Scikit-learn, Joblib, NumPy
