SignSense AI
A real-time, empathetic, and educational bridge between the hearing and Deaf communities
💡 Inspiration
Communication is a fundamental human right, yet spontaneous interactions between hearing individuals and Deaf signers are still blocked by an invisible “silence barrier.” In moments ranging from ordering coffee to medical emergencies, existing tools fail to capture emotion, urgency, and intent—core components of sign language.
We set out to build more than a translator. SignSense AI interprets meaning from facial expressions, body language, and gesture intensity, while also educating hearing users on how sign language communicates nuance.
🏗️ What We Built
SignSense AI is a bidirectional, real-time communication system that translates between sign language and text while teaching users why a sign means what it means.
It functions as an empathy bridge, not just a conversion tool.
⚙️ How We Built It
Multimodal Interpretation (Sign → Text)
- Powered by Google Gemini 3 Flash
- Analyzes video streams for:
  - Handshape and motion
  - Facial expressions (eyebrows, gaze)
  - Gesture speed and amplitude
- Detects emotional sentiment and urgency level
- Outputs structured JSON → text (see the sketch below)
This allows the system to distinguish between:
“I need help” vs “I NEED HELP NOW”
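For illustration, here is a minimal sketch of what that structured output could look like. The field names (`text`, `emotion`, `urgency`) and the urgency threshold are hypothetical placeholders, not the project's actual schema:

```typescript
// Hypothetical shape of the structured JSON produced by the
// interpretation step (field names are illustrative).
interface SignInterpretation {
  text: string;                                // plain rendering, e.g. "I need help"
  emotion: "neutral" | "happy" | "distressed"; // detected sentiment
  urgency: number;                             // 0 (casual) .. 1 (emergency)
}

// Escalate the rendered text when urgency is high, mirroring how
// gesture speed and amplitude escalate meaning in sign language.
function renderWithUrgency(r: SignInterpretation): string {
  return r.urgency > 0.8 ? `${r.text.toUpperCase()}!` : r.text;
}

// renderWithUrgency({ text: "I need help", emotion: "distressed", urgency: 0.9 })
// => "I NEED HELP!"
```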
Procedural 3D Signing Engine (Text → Sign)
- Built with Three.js
- Converts typed input into ASL gloss choreography (see the sketch after this list)
- Drives a fully articulated 3D avatar using:
  - Skeletal animation
  - Inverse kinematics
  - Procedural finger control (15+ joints per hand)
- No pre-recorded animations: every sign is generated dynamically.
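As a sketch of how that conversion might work, the snippet below maps a gloss string to an ordered list of target poses. The `SIGN_POSES` dictionary, bone names, and rotation values are hypothetical placeholders, not the actual engine:

```typescript
// Target joint rotations (in radians) for named bones of the hand rig.
type JointTargets = Record<string, { x: number; y: number; z: number }>;

// Illustrative sign dictionary: each gloss token maps to a target pose
// and a transition duration. Real signs would need many keyframes.
const SIGN_POSES: Record<string, { targets: JointTargets; durationMs: number }> = {
  HELP: { targets: { index_01_R: { x: 1.1, y: 0, z: 0.2 } }, durationMs: 600 },
  NEED: { targets: { index_01_R: { x: 0.3, y: 0, z: 0.0 } }, durationMs: 500 },
};

// Turn a gloss string like "HELP NEED" into an ordered choreography
// that the animation loop can play back one pose at a time.
function choreograph(gloss: string) {
  return gloss.split(/\s+/).map((token) => SIGN_POSES[token]).filter(Boolean);
}
```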
Joint motion is computed using linear interpolation:
$$ \theta_t = \text{Lerp}(\theta_\text{current}, \theta_\text{target}, \alpha) $$
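In Three.js terms, that per-frame update might look like the sketch below; the bone-name lookup and the choice of `alpha` are assumptions, not the engine's actual code:

```typescript
import * as THREE from "three";

// Same shape as in the choreography sketch above.
type JointTargets = Record<string, { x: number; y: number; z: number }>;

// Per-frame update: ease every bone of the rig toward its target pose.
// alpha is the interpolation factor (0..1) applied each frame.
function updateJoints(skeleton: THREE.Skeleton, targets: JointTargets, alpha: number): void {
  for (const bone of skeleton.bones) {
    const t = targets[bone.name];
    if (!t) continue;
    // theta_t = Lerp(theta_current, theta_target, alpha), per axis
    bone.rotation.x = THREE.MathUtils.lerp(bone.rotation.x, t.x, alpha);
    bone.rotation.y = THREE.MathUtils.lerp(bone.rotation.y, t.y, alpha);
    bone.rotation.z = THREE.MathUtils.lerp(bone.rotation.z, t.z, alpha);
  }
}
```

A small, constant `alpha` gives exponential easing toward the target, which keeps motion smooth even when target poses change mid-transition.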
📚 What We Learned
- Translation alone is insufficient—speed, size, and facial expression change meaning
- Deaf communication relies heavily on non-manual markers
- Embedding education into interaction improves accessibility and understanding
🚧 Challenges
- Video Latency: Solved by optimizing Gemini Flash inference and enforcing fast, structured outputs
- 3D Articulation: Mapping abstract AI instructions to a fully procedural hand rig required intensive math and debugging
- Accessibility UX: Balancing fast conversations with deep educational insights
🔁 Key Features
- Sign → Text: Video input → text + emotion + urgency
- Text → Sign: Input → ASL gloss → real-time 3D signing
- Dual Modes:
  - Quick Conversation for speed
  - Understand & Learn for education and cultural context
🏆 Accomplishments
- Built a fully procedural 3D signing avatar
- Created a multimodal pipeline that understands emotion and urgency
- Enabled real-time, empathetic communication for critical use cases
🚀 What’s Next
- Google Veo 3 Integration: Enable more realistic, expressive signing motion for higher-fidelity responses
- Smarter Signing AI: Improve text → sign accuracy, including speed, emphasis, and facial markers
- Mobile Support: Make SignSense AI fully accessible on smartphones for everyday use
- Voice Integration: Add a voice option for generating replies (Voice → Sign)
- Real-World Testing: Validate performance in real environments like clinics and public services
