Sign2Speak

Inspiration

Watching friends struggle to communicate with members of the deaf community inspired our goal: giving everyone a voice.

Interpreters can cost $150/hour and aren't always available, and existing tech is clunky and slow. We envisioned AI-powered glasses that turn signing into speech and speech into visible text, breaking down barriers between communities.

What We Built

Sign2Speak features:

Dual-camera ASL recognition with wide-angle capture for comprehensive gesture detection
Cloud-based computer vision pipeline using Gemini's models for real-time sign language interpretation
ElevenLabs speech synthesis for natural voice output from sign language input
AR text overlay for speech-to-text display in the user's field of vision
Bidirectional communication flow enabling seamless deaf-hearing conversations

Technical Architecture

Frontend: Smart glasses interface
Computer Vision: Dual-camera setup with cloud-based Gemini model processing
Speech Synthesis: ElevenLabs API for natural voice generation
Speech Recognition: Built-in microphone with real-time transcription
AR Display: Text overlay rendering for speech-to-text output
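
The two directions of this architecture can be sketched as a pair of small pipelines. This is a minimal illustration under stated assumptions, not the project's actual code: the class names SignToSpeech and SpeechToOverlay, the Frame type, and all four callables are hypothetical placeholders standing in for the Gemini vision call, the ElevenLabs call, the transcription service, and the AR renderer.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical placeholder type: one encoded camera frame.
Frame = bytes


@dataclass
class SignToSpeech:
    """Signer -> hearing listener: camera frames -> text -> audio."""
    # Stand-in for a cloud vision model call (assumption, not a real API).
    interpret_signs: Callable[[Sequence[Frame], Sequence[Frame]], str]
    # Stand-in for a text-to-speech call (assumption, not a real API).
    synthesize: Callable[[str], bytes]

    def run(self, left: Sequence[Frame], right: Sequence[Frame]) -> bytes:
        text = self.interpret_signs(left, right)
        return self.synthesize(text)


@dataclass
class SpeechToOverlay:
    """Hearing speaker -> signer: microphone audio -> text -> AR overlay."""
    transcribe: Callable[[bytes], str]
    render_overlay: Callable[[str], None]

    def run(self, audio: bytes) -> str:
        text = self.transcribe(audio)
        self.render_overlay(text)
        return text


# Usage with trivial stand-in functions instead of real services.
shown: List[str] = []
sign_to_speech = SignToSpeech(
    interpret_signs=lambda left, right: "hello",
    synthesize=lambda text: text.encode("utf-8"),
)
speech_to_overlay = SpeechToOverlay(
    transcribe=lambda audio: audio.decode("utf-8"),
    render_overlay=shown.append,
)
audio_out = sign_to_speech.run([b"frame-left"], [b"frame-right"])
caption = speech_to_overlay.run(b"nice to meet you")
```

Passing the services in as plain callables keeps each direction testable without network access; real clients would be wired in at the same points.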

What We Learned

LLM sign language recognition: Modern multimodal LLMs can interpret signs well when given dual-camera input, which made gesture interpretation more precise
Real-time video processing: Latency remains challenging, but recent real-time video APIs suggest that sub-200 ms processing is feasible
Training data limitations: Sign language datasets are limited, so scaling will require significant additional data collection
Hardware constraints: Wearables force a trade-off among processing power, battery life, and form factor
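
One way to check a pipeline step against the sub-200 ms figure above is to time it repeatedly and compare the median to a budget. The sketch below is only an illustration: measure_latency_ms, within_budget, and LATENCY_BUDGET_MS are hypothetical names, and the timed step is a sleep call standing in for a real processing stage.

```python
import time
from statistics import median
from typing import Callable, List, Sequence

# Budget taken from the sub-200 ms figure mentioned above.
LATENCY_BUDGET_MS = 200.0


def measure_latency_ms(step: Callable[[], object], runs: int = 20) -> List[float]:
    """Run `step` several times and return each wall-clock duration in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def within_budget(samples: Sequence[float], budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    """True when the median sample is at or under the budget."""
    return median(samples) <= budget_ms


# Usage: a 5 ms sleep stands in for a real processing stage.
samples = measure_latency_ms(lambda: time.sleep(0.005), runs=5)
fast_enough = within_budget(samples)
```

The median is used here because a single slow network round trip would otherwise dominate; a stricter check could compare a high percentile instead.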

Challenges We Overcame

Real-time processing limitations: Engineered a custom pipeline using Gemini's vision models with optimized preprocessing
Dual-camera synchronization: Implemented frame-perfect alignment for accurate 3D gesture reconstruction, since a single camera gave poor results due to its limited field of view
Context preservation: Developed conversation state management to maintain dialogue flow and reduce interpretation errors
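
The synchronization and context items above can be illustrated with a short sketch. It is a simplified stand-in, not the project's implementation: pair_frames uses greedy timestamp matching within a tolerance instead of the frame-perfect alignment described above, and ConversationState is a hypothetical rolling window of recent turns. All names and the 8 ms tolerance are assumptions.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple


def pair_frames(
    left_ts: Sequence[float],
    right_ts: Sequence[float],
    tolerance_ms: float = 8.0,  # assumed tolerance, not a measured value
) -> List[Tuple[int, int]]:
    """Greedily pair indices of two sorted timestamp lists (in ms).

    Two frames are paired when their capture times differ by at most
    tolerance_ms; frames with no close partner are dropped.
    """
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < len(left_ts) and j < len(right_ts):
        delta = left_ts[i] - right_ts[j]
        if abs(delta) <= tolerance_ms:
            pairs.append((i, j))
            i += 1
            j += 1
        elif delta < 0:
            i += 1  # left frame is older than any remaining right frame
        else:
            j += 1  # right frame is older than any remaining left frame
    return pairs


class ConversationState:
    """Keeps the last few turns so a model prompt can include context."""

    def __init__(self, max_turns: int = 6) -> None:
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))

    def context(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


# Usage: the third left frame (66 ms) has no right partner and is dropped.
pairs = pair_frames([0.0, 33.0, 66.0, 100.0], [2.0, 35.0, 90.0, 101.0])

state = ConversationState(max_turns=2)
state.add("signer", "hello")
state.add("hearing", "hi there")
state.add("signer", "how are you")
recent = state.context()
```

The bounded deque drops the oldest turn automatically, which keeps the context sent with each request small.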

Impact & Future Vision

Sign2Speak addresses a $40+ billion assistive technology market while fostering inclusive communication. Our roadmap includes expanding to international sign languages, improving offline processing, and delivering real-time translation.

We believe everyone deserves a voice, and that is our ultimate mission.