Inspiration

During Zoom calls, deaf and hard-of-hearing individuals often struggle to communicate without relying on typing or interpreters. We wanted to create a real-time, AI-powered solution to bridge this gap.

What It Does

SignSense converts American Sign Language (ASL) into spoken language using just a webcam. It processes signs in real time and generates natural-sounding speech, so users can communicate without any additional hardware.

How We Built It

  • Webcam Input – Captures ASL gestures.
  • MediaPipe – Extracts hand, face, and body landmarks.
  • LSTM – Classifies sequences of landmarks as ASL signs.
  • LLM & Transformer – Turns the recognized signs into structured sentences for speech output.
  • React Frontend + AI Backend – Built for real-time interaction.
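To illustrate how per-frame landmarks might be fed to a sequence model in real time, here is a minimal sketch of a sliding-window buffer. It is not our actual code: `SlidingWindowRecognizer`, `stub_classifier`, the window size, and the stride are all hypothetical, and the stub stands in for a trained LSTM.

```python
from collections import deque
from typing import Callable, Deque, List, Optional, Sequence

Frame = List[float]  # one flattened landmark vector per video frame


class SlidingWindowRecognizer:
    """Buffers landmark frames and classifies each full window (hypothetical helper)."""

    def __init__(self, classify: Callable[[Sequence[Frame]], str],
                 window_size: int = 30, stride: int = 10) -> None:
        self.classify = classify          # stand-in for a trained sequence model
        self.window_size = window_size    # frames per classified window (assumed value)
        self.stride = stride              # frames between classifications (assumed value)
        self.buffer: Deque[Frame] = deque(maxlen=window_size)
        self.frames_since_last = 0

    def push(self, frame: Frame) -> Optional[str]:
        """Add one frame; return a label only when a new window is due."""
        self.buffer.append(frame)
        self.frames_since_last += 1
        if len(self.buffer) < self.window_size:
            return None                   # not enough context yet
        if self.frames_since_last < self.stride:
            return None                   # wait until the window has moved by `stride`
        self.frames_since_last = 0
        return self.classify(list(self.buffer))


def stub_classifier(window: Sequence[Frame]) -> str:
    """Placeholder for the LSTM: labels a window by its mean first coordinate."""
    mean_x = sum(frame[0] for frame in window) / len(window)
    return "HELLO" if mean_x > 0.5 else "UNKNOWN"


if __name__ == "__main__":
    recognizer = SlidingWindowRecognizer(stub_classifier, window_size=4, stride=2)
    for i in range(8):
        label = recognizer.push([i / 8.0])
        if label is not None:
            print(f"frame {i}: {label}")
```

A larger stride means fewer model calls and lower compute cost, but the recognized sign is reported later, so the two values trade latency against load.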

Challenges We Ran Into

  • Training the model to recognize variations in signing styles.
  • Reducing latency for real-time translation.
  • Ensuring accuracy across different ASL users and environments.
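One common way to reduce sensitivity to where a signer sits and how large their hand appears on camera is to normalize the landmarks before classification. The sketch below shows that idea under stated assumptions and is not necessarily what SignSense does: it assumes the 21-point hand layout used by MediaPipe Hands (wrist at index 0, middle-finger MCP at index 9), and `normalize_hand` is a hypothetical helper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]

WRIST = 0               # index of the wrist in the assumed 21-point hand layout
MIDDLE_FINGER_MCP = 9   # index of the middle-finger knuckle in that layout


def normalize_hand(landmarks: List[Point]) -> List[Point]:
    """Translate landmarks so the wrist is the origin, then scale by palm length."""
    wx, wy, wz = landmarks[WRIST]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in landmarks]
    # Palm length: distance from the wrist (now the origin) to the middle knuckle.
    scale = math.dist((0.0, 0.0, 0.0), centered[MIDDLE_FINGER_MCP])
    if scale == 0.0:
        return centered  # degenerate detection; skip scaling to avoid dividing by zero
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]


if __name__ == "__main__":
    hand = [(0.1 * i, 0.05 * i, 0.0) for i in range(21)]
    # The same hand shape, twice as large and shifted in the frame.
    moved = [(2 * x + 0.3, 2 * y - 0.1, 2 * z) for x, y, z in hand]
    same = all(
        math.isclose(p, q, abs_tol=1e-9)
        for a, b in zip(normalize_hand(hand), normalize_hand(moved))
        for p, q in zip(a, b)
    )
    print("normalized shapes match:", same)
```

This only removes differences in position and scale. Variation in signing speed, handedness, and regional style would still need to be covered by the training data.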

Accomplishments That We're Proud Of

  • Built an end-to-end AI system that translates ASL to speech.
  • Reached real-time performance, with latency low enough for live conversation.
  • Created an accessible solution that needs no hardware beyond a webcam and could serve millions of signers.

What We Learned

  • The importance of inclusive design in AI-driven accessibility.
  • Optimizing deep learning models for real-time applications.
  • How multimodal AI (gesture + language processing) improves translation accuracy.

What's Next for SignSense

  • Expanding to support other sign languages (BSL, ISL, etc.).
  • Integrating with Zoom, Microsoft Teams, and other platforms.
  • Improving accuracy with larger datasets and user feedback.
  • Adding speech-to-text for full two-way communication.

https://docs.google.com/presentation/d/1UM2kDEhoAO2jmulpJekIJauD8dl08zJZZ4pdmoQIQ7A/edit?usp=sharing
