Babel: Breaking the Language Barrier, One Room at a Time

Inspiration

Language should never determine who gets heard. Walking across NJIT's campus, you see it every day: international students struggling to communicate, immigrant families navigating systems in a language that isn't theirs, people with real talent locked out simply because of the words they use. We wanted to fix that. Not with a clunky app or a phone held awkwardly between two people, but with something invisible and human.

What We Learned

Building Babel taught us that real-time translation isn't just a latency problem; it's a meaning problem. A word-for-word swap breaks tone, context, and trust. We learned to frame every Claude API call around preserving intent, not just content. We also learned that the hardest engineering problems aren't always in the code; sometimes they're in the room acoustics during a live demo.

How We Built It

Babel runs entirely in the browser - no app, no install friction.

  • Frontend: Vanilla JS + WebSockets for sub-2s room sync
  • Speech: Web Speech API for mic input and TTS output
  • Translation core: Claude API - single prompt handles language detection + translation + tone preservation simultaneously
  • Rooms: Any group joins via a shared code; each participant hears translated audio through their own earbuds
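To illustrate the "single prompt" idea, here is a minimal sketch of how one Claude Messages API request could bundle language detection, translation, and tone preservation. The model name, prompt wording, and `buildTranslationRequest` helper are illustrative assumptions, not Babel's actual code.

```javascript
// Hypothetical helper: builds the body of one Claude Messages API request
// that detects the source language, translates, and preserves tone in a
// single pass. Prompt text and model name are illustrative only.
function buildTranslationRequest(text, targetLang) {
  return {
    model: "claude-3-5-sonnet-20241022", // placeholder model choice
    max_tokens: 1024,
    system:
      "Detect the source language of the user's message, then translate it into " +
      targetLang +
      ". Preserve tone, register, and intent. Never translate word-for-word. " +
      "Reply with the translation only.",
    messages: [{ role: "user", content: text }],
  };
}
```

The payload would then be POSTed to the Claude API once per utterance, so detection and translation never require a second round trip.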

The core translation loop looks roughly like this. For a message of $n$ tokens, we target end-to-end latency $L$ such that:

$$L = t_{\text{STT}} + t_{\text{Claude}} + t_{\text{TTS}} < 2000 \text{ ms}$$

We optimized each stage independently to keep the conversation feeling natural.
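The budget above can be expressed as a trivial check. This is a toy sketch of the arithmetic, not instrumentation from the app; the stage timings passed in are assumed to be measured elsewhere.

```javascript
// Toy latency-budget check for L = t_STT + t_Claude + t_TTS < 2000 ms.
// Stage timings (in milliseconds) are assumed to come from real instrumentation.
const BUDGET_MS = 2000;

function withinBudget(sttMs, claudeMs, ttsMs) {
  return sttMs + claudeMs + ttsMs < BUDGET_MS;
}
```

Framing it this way made each stage's share of the 2000 ms explicit, so a regression in any one stage showed up immediately.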

Challenges

  • Echo cancellation: Earbuds feeding back into the mic during live demos nearly broke us
  • Latency under load: Multi-person rooms multiplied API calls; we had to parallelize Claude requests per participant stream
  • Tone fidelity: Early prompts returned robotic translations; iterating the system prompt to preserve warmth and register was more art than science
  • OpenCV ASL recognition: We attempted to integrate computer vision-based American Sign Language detection to support deaf and hard-of-hearing users. We couldn't get it stable in time, but it's the first thing we're shipping after today.
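The "parallelize Claude requests per participant stream" fix can be sketched like this: fan one utterance out to every participant's target language concurrently instead of awaiting each translation in turn. `translateFor` is a stand-in for the real Claude call; any async `(text, lang) => string` works here.

```javascript
// Sketch: translate one utterance for every participant in parallel.
// `translateFor` stands in for the real Claude API call and is assumed
// to be an async function (text, targetLang) => translatedText.
async function broadcastTranslations(text, participants, translateFor) {
  return Promise.all(
    participants.map(async (p) => ({
      id: p.id,
      lang: p.lang,
      // Each participant's translation runs concurrently; Promise.all
      // resolves once the slowest request finishes.
      text: await translateFor(text, p.lang),
    }))
  );
}
```

With N participants, total wall-clock time is the slowest single request rather than the sum of N requests, which is what kept multi-person rooms under the latency budget.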
