Inspiration
Sign language is a beautiful and expressive form of communication, yet millions of deaf and hearing-impaired people still face barriers when interacting with non-signers. Traditional interpreters are costly, not always available, and cannot be embedded inside everyday digital interactions.
What it does
- Helps deaf users communicate naturally using sign language (via video).
- Helps non-signers understand and respond, using AI translation.
- Provides ASL animations with accurate finger movements, so hearing users can learn and reply visually.
- Works entirely inside the browser, with no special hardware.
How we built it
Frontend – Interactive Real-Time UI
Used React with TailwindCSS to build an intuitive two-sided chat interface:
- Deaf users can record or upload a video showing sign gestures.
- Hearing users can type or speak using voice input.
- Both sides see clean, timestamped translations.
- ASL animations are rendered using HTML5 Canvas.
Browser APIs play a huge role:
- MediaRecorder → video and audio capture
- Web Speech API → speech-to-text
- SpeechSynthesis → text-to-speech
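The capture step can be sketched as follows. Which container and codec combinations MediaRecorder supports varies by browser, so one common approach is to probe a list of candidates with `MediaRecorder.isTypeSupported`. The helper name `pickRecorderMimeType` and the candidate list are assumptions made for illustration, not taken from the project.

```typescript
// Hypothetical helper: pick the first recording format the browser supports.
// The candidate list is an assumption; adjust it to what the backend accepts.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/webm",
  "video/mp4",
];

function pickRecorderMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  // Returns undefined when nothing matches, so the caller can fall back
  // to the browser's default format.
  return candidates.find((mimeType) => isSupported(mimeType));
}

// In the browser this would be wired up roughly as:
//   const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
//   const mimeType = pickRecorderMimeType((t) => MediaRecorder.isTypeSupported(t));
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

Passing the support check in as a function keeps the selection logic testable outside a browser.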
Backend – AI Processing Engine
The backend is a lightweight Node.js + Express service hosted on Cloud Run. Its job is simple but powerful:
- Accept video, audio, or text from the frontend
- Convert video to base64 and send it to the Gemini multimodal model
- Generate ASL HTML Canvas animation code from text
- Return the generated output from the generateContent endpoints
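A minimal sketch of the video-to-Gemini step, assuming the backend talks to the public Gemini REST API, whose generateContent body carries inline media as base64 under `inline_data`. The helper names `buildSignVideoRequest` and `extractText` and the prompt text are hypothetical; the project may well use an SDK instead.

```typescript
// Shapes follow the Gemini REST generateContent API; only the fields
// used in this sketch are modeled.
interface InlineDataPart { inline_data: { mime_type: string; data: string } }
interface TextPart { text: string }
type Part = InlineDataPart | TextPart;

interface GenerateContentRequest {
  contents: { parts: Part[] }[];
}

interface GenerateContentResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Hypothetical helper: wrap a base64-encoded clip and an instruction prompt.
// In Node, the base64 string would typically come from buffer.toString("base64").
function buildSignVideoRequest(
  videoBase64: string,
  mimeType: string,
  prompt: string,
): GenerateContentRequest {
  return {
    contents: [
      {
        parts: [
          { inline_data: { mime_type: mimeType, data: videoBase64 } },
          { text: prompt },
        ],
      },
    ],
  };
}

// Hypothetical helper: join the text parts of the first candidate.
// Returns an empty string when the response carries no text.
function extractText(response: GenerateContentResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("");
}
```

The request object would be sent as the JSON body of a POST to the model's generateContent endpoint, and `extractText` applied to the parsed JSON reply before returning it to the frontend.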
AI Layer – Google Gemini 2.0
Two Gemini models are used:
- gemini-2.0-flash → Video → Text (sign language interpretation)
- gemini-2.0-pro or Flash → Text → ASL animation code
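One way to keep the two roles separate is a small task-to-model table, sketched below. The task names, the `generateContentUrl` helper, and the choice of Flash as the default for the text-to-ASL step are assumptions; the URL pattern is that of the public Gemini REST API, which the project may or may not call directly.

```typescript
type Task = "signToText" | "textToAsl";

// Model IDs as named in the write-up. The text-to-ASL step is described as
// using either Pro or Flash; defaulting to Flash here is an assumption.
const MODEL_FOR_TASK: Record<Task, string> = {
  signToText: "gemini-2.0-flash",
  textToAsl: "gemini-2.0-flash",
};

const API_BASE = "https://generativelanguage.googleapis.com/v1beta";

// Hypothetical helper: build the generateContent URL for a task,
// optionally overriding the model for that task.
function generateContentUrl(
  task: Task,
  overrides: Partial<Record<Task, string>> = {},
): string {
  const model = overrides[task] ?? MODEL_FOR_TASK[task];
  return `${API_BASE}/models/${model}:generateContent`;
}
```

For example, `generateContentUrl("textToAsl", { textToAsl: "gemini-2.0-pro" })` switches only the animation step to the Pro model named above.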
Deployment – Fully Serverless
Using Google Cloud Run, the app scales from zero and needs no server maintenance.
- The frontend runs in a Dockerized Nginx container
- The backend runs in a Dockerized Node.js container
- Cloud Run handles HTTPS, load balancing, and scaling
- IAM configuration ensures the backend API is publicly invokable
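For the Node.js container, Cloud Run tells the service which port to listen on through the `PORT` environment variable. A minimal sketch of reading it is shown below; the helper name `resolvePort` and the 8080 fallback for local runs are assumptions, not details from the project.

```typescript
// Cloud Run passes the port to listen on through the PORT environment variable.
// Falling back to 8080 for local runs is an assumption of this sketch.
function resolvePort(
  env: Record<string, string | undefined>,
  fallback = 8080,
): number {
  const parsed = Number.parseInt(env["PORT"] ?? "", 10);
  // Reject missing, non-numeric, or out-of-range values.
  return Number.isInteger(parsed) && parsed > 0 && parsed < 65536 ? parsed : fallback;
}

// With Express this would be used roughly as:
//   app.listen(resolvePort(process.env), "0.0.0.0");
```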