Inspiration
During Zoom calls, deaf and hard-of-hearing individuals often struggle to communicate without relying on typing or interpreters. We wanted to create a real-time, AI-powered solution to bridge this gap.
What It Does
SignSense converts ASL into spoken language using just a webcam. It processes sign language gestures in real time and generates natural speech, enabling seamless communication without additional hardware.
How We Built It
- Webcam Input – Captures ASL gestures.
- MediaPipe – Extracts hand, face, and body landmarks.
- LSTM – Analyzes sequential ASL gestures.
- LLM & Transformer – Converts recognized signs into structured sentences for speech output.
- React Frontend + AI Backend – Built for real-time interaction.
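The hand-off from landmark extraction to the sequence model can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a single hand with MediaPipe's 21 hand landmarks, each an (x, y, z) tuple, and the names `flatten_landmarks`, `FrameWindow`, and the 30-frame window length are hypothetical. Face and body landmarks are left out for brevity.

```python
from collections import deque

# MediaPipe Hands reports 21 landmarks per hand, each with x, y, z.
NUM_LANDMARKS = 21
FEATURES_PER_FRAME = NUM_LANDMARKS * 3
WINDOW = 30  # assumed number of frames per LSTM input sequence


def flatten_landmarks(landmarks):
    """Turn [(x, y, z), ...] into one flat feature list.

    Frames where no hand was detected (None) become all zeros so the
    sequence keeps a fixed shape.
    """
    if landmarks is None:
        return [0.0] * FEATURES_PER_FRAME
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(landmarks)}")
    return [float(coord) for point in landmarks for coord in point]


class FrameWindow:
    """Keeps the most recent frames as a fixed-length sliding window."""

    def __init__(self, size=WINDOW):
        self.frames = deque(maxlen=size)

    def push(self, landmarks):
        """Add one frame; return the full window once enough frames exist."""
        self.frames.append(flatten_landmarks(landmarks))
        if len(self.frames) == self.frames.maxlen:
            return list(self.frames)  # shape: size x FEATURES_PER_FRAME
        return None
```

Each returned window is a list of per-frame feature vectors, which is the shape a sequence model such as an LSTM would typically take as input.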
Challenges We Ran Into
- Training the model to recognize variations in signing styles.
- Reducing latency for real-time translation.
- Ensuring accuracy across different ASL users and environments.
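One common way to reduce sensitivity to where a signer sits and how far they are from the camera is to normalize landmarks before classification. The sketch below is an assumption about how that could be done, not a description of what SignSense does; `normalize_hand` is a hypothetical helper. It centers the hand on landmark 0 (the wrist in MediaPipe's hand model) and rescales so the farthest landmark is at distance 1.

```python
import math


def normalize_hand(landmarks):
    """Make hand landmarks invariant to position and apparent size.

    landmarks: list of (x, y, z) tuples, with index 0 being the wrist.
    Returns a new list centered on the wrist and scaled so the farthest
    point lies at distance 1 from it.
    """
    ox, oy, oz = landmarks[0]
    centered = [(x - ox, y - oy, z - oz) for x, y, z in landmarks]
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if scale == 0:
        # Degenerate case: every point sits on the wrist, nothing to scale.
        return centered
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]
```

With this step, the same handshape seen in a different part of the frame or at a different distance yields approximately the same feature values. It does not address variation in signing style itself, which still depends on the training data.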
Accomplishments That We're Proud Of
- Successfully built an end-to-end AI system that translates ASL to speech.
- Achieved real-time performance with minimal latency.
- Created an accessible solution that needs no hardware beyond a webcam.
What We Learned
- The importance of inclusive design in AI-driven accessibility.
- Optimizing deep learning models for real-time applications.
- How multimodal AI (gesture + language processing) improves translation accuracy.
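As an illustration of how the gesture and language stages might connect, the sketch below collapses noisy per-window predictions into a short sequence of glosses and then builds a text prompt for a language model. Everything here is assumed for the example: the function names, the confidence threshold, the minimum run length, and the prompt wording are hypothetical and not taken from the project.

```python
def collapse_predictions(preds, min_conf=0.8, min_run=3):
    """Reduce per-window (label, confidence) predictions to stable glosses.

    A label is emitted once when it has been predicted with at least
    min_conf confidence for min_run consecutive windows. Low-confidence
    windows reset the run, so the same sign can be emitted again later.
    """
    glosses = []
    current, run = None, 0
    for label, conf in preds:
        if conf < min_conf:
            current, run = None, 0
            continue
        if label == current:
            run += 1
        else:
            current, run = label, 1
        if run == min_run:
            glosses.append(label)
    return glosses


def build_prompt(glosses):
    """Build a plain-text instruction asking an LLM to smooth the glosses."""
    return (
        "Rewrite these ASL glosses as one natural English sentence: "
        + " ".join(glosses)
    )
```

The resulting prompt string could then be sent to whichever language model is in use, and the returned sentence passed on to speech synthesis. Filtering out brief or low-confidence predictions first keeps spurious signs from reaching the language stage.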
What's Next for SignSense
- Expanding to support other sign languages (BSL, ISL, etc.).
- Integrating with Zoom, Microsoft Teams, and other platforms.
- Improving accuracy with larger datasets and user feedback.
- Adding speech-to-text for full two-way communication.
https://docs.google.com/presentation/d/1UM2kDEhoAO2jmulpJekIJauD8dl08zJZZ4pdmoQIQ7A/edit?usp=sharing
Built With
- javascript
- lstm
- matplotlib
- mediapipe
- numpy
- ollama
- opencv
- python
- tensorflow
- zonos


