HandsUP was born from a simple but powerful realization: online communication has become an essential part of how we live, learn, and work, yet accessibility within these spaces still lags behind. As video chat grows across platforms like Zoom, Discord, WhatsApp, and FaceTime, millions of people rely on these tools for connection. But for those who communicate primarily through American Sign Language (ASL), the ability to participate fully in these conversations remains limited. Our team wanted to change that. We wanted to build a solution that empowers everyone, regardless of how they communicate, to express their thoughts and ideas naturally in real time.
HandsUP is an American Sign Language-to-Speech model designed to translate ASL gestures into spoken words in real time. It enables individuals who communicate through sign rather than speech to participate seamlessly through technology. The vision behind HandsUP is to make online spaces more inclusive and to give every person a voice, quite literally, in the conversations that matter. Using a live camera feed, the model detects hand gestures, interprets them as ASL signs, and converts them into clear, human-like speech output, bridging the gap between signers and non-signers in virtual environments.
Creating HandsUP required blending computer vision, machine learning, and speech synthesis into one cohesive system. We developed the model using multiple Python APIs and frameworks, including MediaPipe, TensorFlow, OpenCV (cv2), NumPy, and ElevenLabs. MediaPipe allowed us to capture detailed hand landmarks in real time, TensorFlow powered our deep learning model to recognize and classify signs, and OpenCV handled live video processing. NumPy was used for data manipulation and mathematical computations, while ElevenLabs provided the text-to-speech functionality that transformed recognized signs into natural voice output. Together, these technologies formed the core of our ASL recognition and translation pipeline, enabling near-instantaneous interpretation of hand gestures into spoken words.
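To make that pipeline concrete, here is a minimal sketch of the capture-and-recognize loop, assuming MediaPipe's `solutions.hands` API; the classifier `model`, its label list `SIGN_LABELS`, and the final text-to-speech step are placeholders for components the writeup doesn't detail.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures frames in BGR.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            # Flatten the 21 (x, y, z) landmarks into a 63-value feature vector.
            hand = results.multi_hand_landmarks[0]
            features = np.array([[p.x, p.y, p.z] for p in hand.landmark]).ravel()
            # Hypothetical classifier and label list (not part of the writeup):
            # sign = SIGN_LABELS[np.argmax(model.predict(features[None, :], verbose=0))]
            # The recognized sign would then be voiced via the ElevenLabs TTS API.
        cv2.imshow("HandsUP", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```

Working from landmarks rather than raw pixels keeps the classifier small, which is part of what makes near-real-time inference on a webcam feed feasible.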
Like any innovative project, HandsUP came with its challenges. One of our biggest obstacles was the lack of a structured dataset for ASL recognition. Most publicly available resources consisted of static images or short clips of people signing, rather than large, labeled datasets suitable for machine learning. We had to adapt by extracting and normalizing visual data ourselves, using MediaPipe’s landmark detection to standardize positions and scales for training. We also faced difficulties achieving smooth, low-latency translation during live testing. It required several iterations of model tuning and algorithm optimization to ensure that the system could process frames quickly without sacrificing accuracy. Each challenge forced us to become more creative and technically adaptive, pushing us to find new solutions under tight time constraints.
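The writeup doesn't spell out the exact normalization scheme, but a common approach with MediaPipe landmarks, offered here only as an assumed sketch, is to make each hand wrist-relative and scale-invariant before training:

```python
import numpy as np

def normalize_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """Normalize a (21, 3) array of MediaPipe hand landmarks.

    Translating so the wrist (landmark 0) sits at the origin removes
    dependence on where the hand appears in the frame; dividing by the
    largest wrist-to-landmark distance removes dependence on how far
    the hand is from the camera.
    """
    pts = landmarks - landmarks[0]               # wrist-relative coordinates
    scale = np.linalg.norm(pts, axis=1).max()    # hand "size" in this frame
    return pts / scale if scale > 0 else pts
```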
Despite these hurdles, our team is proud of what we achieved. We built a working ASL-to-Speech prototype capable of real-time translation. We successfully integrated multiple complex frameworks into a single system, proving that a lightweight, accessible ASL translation tool is possible. Most importantly, we created something with the potential for meaningful social impact. HandsUP represents more than technical progress; it represents inclusion. It allows those who rely on ASL to participate in digital conversations with the same ease and immediacy as anyone else.
Throughout the development process, we learned how critical preprocessing and normalization are for effective machine learning, and we saw firsthand how small optimizations in our TensorFlow models could dramatically improve real-time performance. Beyond the technical lessons, we also learned about the human side of innovation: how technology can be a bridge for understanding, accessibility, and equality. HandsUP reminded us that artificial intelligence isn't just about data and algorithms; it's about empowering people to communicate and connect.
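The writeup doesn't name those specific optimizations, but one standard way to cut TensorFlow inference latency for real-time use is converting the trained Keras classifier to TensorFlow Lite with default weight quantization; the model file names below are hypothetical:

```python
import tensorflow as tf

# Load the trained sign classifier (hypothetical file name).
model = tf.keras.models.load_model("handsup_classifier.keras")

# Convert to TensorFlow Lite with default optimizations (weight quantization),
# which shrinks the model and typically speeds up CPU inference.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("handsup_classifier.tflite", "wb") as f:
    f.write(tflite_model)
```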
Looking forward, we see HandsUP as the first step in a much larger mission. Our next goals include expanding our dataset to cover a wider range of ASL signs, including dynamic movements and two-handed gestures, and integrating the model directly into major video conferencing platforms such as Zoom and Discord. We also plan to explore bidirectional communication, including text-to-sign and speech-to-sign translation, so that conversations between signers and non-signers can flow naturally in both directions. Finally, we hope to optimize HandsUP for mobile and wearable devices, making it even more accessible and portable.
HandsUP is more than a hackathon project; it is a vision for a more inclusive future. It is a tool built with empathy, creativity, and technology, designed to break down communication barriers and give everyone a voice that can be heard. By merging artificial intelligence with real human need, HandsUP doesn't just recognize signs; it recognizes people.
Built With
- 11labs
- ai
- cv2
- machine-learning
- mediapipe
- numpy
- python
- tensorflow
- testing
- training