VoiceBridge
Inspiration
Ever tried watching a foreign-language tech talk or joining an international meeting? Language barriers are frustrating. I built VoiceBridge to break those barriers with AI, making global content and conversations accessible to everyone.
What it does
VoiceBridge provides real-time translation with natural AI voice dubbing for:
- YouTube videos: Watch any video with synchronized translated audio (ElevenLabs voices)
- Live meetings: Speak your language while the other side hears theirs, with bidirectional real-time translation
- Smart Q&A: Ask questions about any video using RAG-powered AI (Gemini)
How I built it
- Frontend: Next.js + TypeScript for responsive UI
- Backend: Node.js + Express deployed on Google Cloud Run
- AI Stack:
  - Vertex AI for translation
  - ElevenLabs for natural voice synthesis
  - Gemini for RAG-based Q&A
  - Google Speech-to-Text for live transcription
- Features: Gender detection, audio caching, sentence batching for smooth playback
- Database: SQLite for transcript storage and history
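The audio caching feature listed above can be illustrated with a small in-memory cache. This is a minimal sketch, not the project's actual code: `AudioCache` and the `Synthesize` signature are hypothetical stand-ins for whatever wraps the voice synthesis call, and keying clips by language, voice, and text is an assumption.

```typescript
// Hypothetical signature standing in for a real text-to-speech call.
type Synthesize = (text: string, voice: string, lang: string) => Promise<Uint8Array>;

class AudioCache {
  // Map keeps insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, Promise<Uint8Array>>();
  private readonly synthesize: Synthesize;
  private readonly maxEntries: number;

  constructor(synthesize: Synthesize, maxEntries = 500) {
    this.synthesize = synthesize;
    this.maxEntries = maxEntries;
  }

  get size(): number {
    return this.entries.size;
  }

  get(text: string, voice: string, lang: string): Promise<Uint8Array> {
    const key = JSON.stringify([lang, voice, text.trim()]);
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      // Re-insert to mark the entry as recently used.
      this.entries.delete(key);
      this.entries.set(key, hit);
      return hit;
    }

    // Cache the promise itself so concurrent requests share one synthesis call.
    const pending = this.synthesize(text, voice, lang);
    this.entries.set(key, pending);

    // Do not keep failed syntheses around.
    pending.catch(() => {
      if (this.entries.get(key) === pending) this.entries.delete(key);
    });

    // Evict the least recently used entry when over capacity.
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return pending;
  }
}
```

Caching the promise rather than the finished bytes means a repeated sentence costs one synthesis request even if the second request arrives before the first completes.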
Challenges
YouTube's IP blocking: YouTube blocks transcript requests coming from cloud provider IPs. Solution: When running locally, the backend fetches transcripts directly through the Innertube API; in production, the client fetches them instead.
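One way to express that split is a small switch on the runtime environment. This is a sketch under assumptions: `chooseTranscriptSource` is a hypothetical helper, and it relies on the `K_SERVICE` environment variable that Cloud Run sets in deployed containers to tell "production" apart from a local run.

```typescript
type TranscriptSource = "backend" | "client";

// Decide who should fetch the transcript.
// Assumption: "production" means "running on Cloud Run", detected via K_SERVICE.
function chooseTranscriptSource(
  env: Record<string, string | undefined>,
): TranscriptSource {
  const onCloudRun = Boolean(env["K_SERVICE"]);
  // Cloud IPs get blocked, so in production the browser fetches the transcript
  // and sends it to the backend; locally the backend can fetch it itself.
  return onCloudRun ? "client" : "backend";
}
```

In a Node.js backend this would typically be called with `process.env`.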
Audio sync: Keeping translated audio in sync with video playback. Solution: Preloading audio, managing an audio queue, and running gender detection up front so the voice is chosen before playback starts.
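The queue logic can be sketched as a pure function that, given the current video time, decides which dubbed clip should be playing and which clips to preload next. This is an illustrative sketch only: `DubSegment`, `PlaybackPlan`, and `planPlayback` are hypothetical names, and it assumes segments are sorted by start time and do not overlap.

```typescript
interface DubSegment {
  start: number; // seconds into the video
  end: number; // seconds into the video
  audioUrl: string;
}

interface PlaybackPlan {
  active: DubSegment | null; // clip that should be playing right now
  offset: number; // seconds into the active clip to seek to
  preload: string[]; // URLs of the next clips to fetch ahead of time
}

function planPlayback(
  segments: DubSegment[],
  videoTime: number,
  lookahead = 2,
): PlaybackPlan {
  // Everything that has not finished yet at the current video time.
  const upcoming = segments.filter((s) => s.end > videoTime);
  const [first, ...rest] = upcoming;
  if (first === undefined) {
    return { active: null, offset: 0, preload: [] };
  }

  // The first unfinished segment is active only if it has already started.
  const isActive = first.start <= videoTime;
  const toPreload = isActive ? rest : upcoming;
  return {
    active: isActive ? first : null,
    offset: isActive ? videoTime - first.start : 0,
    preload: toPreload.slice(0, lookahead).map((s) => s.audioUrl),
  };
}
```

Recomputing the plan from the video's current time on every seek or time update keeps the dubbed audio tied to the player's clock instead of to its own timers.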
Real-time latency: Minimizing delay in live translation. Solution: Batching requests, preloading audio, and optimizing the pipeline, for a total latency of 1.5-2s.
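Sentence batching can be sketched as follows: split the transcript into sentences, then group whole sentences into batches under a size limit so each translation or synthesis request carries several sentences without cutting one in half. The function names and the 200-character default are assumptions for illustration, not the project's actual values.

```typescript
// Split text into sentences on ., ! or ? (a simple heuristic, not full NLP).
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Group whole sentences into batches of at most maxChars characters.
// A single sentence longer than maxChars becomes its own batch.
function batchSentences(sentences: string[], maxChars = 200): string[] {
  const batches: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const joined = current ? `${current} ${sentence}` : sentence;
    if (current && joined.length > maxChars) {
      batches.push(current);
      current = sentence;
    } else {
      current = joined;
    }
  }
  if (current) batches.push(current);
  return batches;
}
```

Larger batches mean fewer requests but a longer wait before the first audio is ready, so the limit is a trade-off between throughput and perceived latency.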
What I learned
- Building production-ready real-time systems requires smart caching strategies
- Working around platform restrictions (YouTube's bot detection) needs creative solutions
- AI voice quality matters: matching the synthesized voice to the speaker via gender detection significantly improves the user experience
- Deploying to serverless (Cloud Run) requires thinking about cold starts and scaling
What's next
- Mobile app for on-the-go translation
- Zoom/Teams integration for business meetings
- Multi-speaker detection in live conversations
- Dialect-specific voice options