i18n meet

## Inspiration

Our team has 3 Spanish speakers and 1 Brazilian Portuguese speaker. During our first planning call, we realized we were constantly stopping to clarify, repeat, or simplify our words. The conversation felt broken.

That's when it hit us: we're building at a hackathon about AI agents, yet we can't even talk naturally with each other.

Language barriers kill conversations. Interpreters are expensive, translation apps are clunky, and existing solutions break the natural flow of conversation.

So we built the personal agent we needed ourselves.

## What it does

i18n meet is your personal translation agent for video calls. Each participant speaks their native language and hears everyone else in that same language, in real time.

- You speak Spanish → others hear English, Portuguese, Japanese...
- They speak Japanese → you hear it in Spanish

Everyone uses their native language. Zero friction.

Key features:

- Real-time speech transcription with automatic language detection
- AI-powered translation to 10+ languages
- Natural voice synthesis with native-sounding voices per language
- Floating AI agent panel with meeting action detection
- Live transcript with original + translated text

## How we built it

The Translation Pipeline: Speech → Daily.co + Deepgram STT → Groq Translation → ElevenLabs TTS → Listener
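
As a rough illustration of the flow above, the sketch below chains translation and speech synthesis for one transcribed utterance. The `Transcript` shape, the `PipelineStages` interface, and `runPipeline` are hypothetical names made up for this example, not the project's actual code. The stages are injected so the vendor calls (Groq, ElevenLabs) stay out of the sketch.

```typescript
// Hypothetical shape of one finished utterance coming out of the STT step.
interface Transcript {
  speakerId: string;
  text: string;
  language: string; // detected language code, e.g. "es"
}

// Hypothetical stage signatures; a real app would wire these to Groq and ElevenLabs.
interface PipelineStages {
  translate(text: string, from: string, to: string): Promise<string>;
  synthesize(text: string, language: string): Promise<Uint8Array>;
}

interface TranslatedUtterance {
  speakerId: string;
  original: string;   // kept for the live transcript
  translated: string; // kept for the live transcript
  audio: Uint8Array;  // synthesized speech for the listener
}

// Transcript -> translation -> TTS for a single listener language.
async function runPipeline(
  transcript: Transcript,
  listenerLanguage: string,
  stages: PipelineStages,
): Promise<TranslatedUtterance> {
  const translated = await stages.translate(
    transcript.text,
    transcript.language,
    listenerLanguage,
  );
  const audio = await stages.synthesize(translated, listenerLanguage);
  return {
    speakerId: transcript.speakerId,
    original: transcript.text,
    translated,
    audio,
  };
}
```

Returning both the original and the translated text alongside the audio lets the same result feed the live transcript and the playback step.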

Tech Stack:

- Daily.co for video infrastructure with native Deepgram transcription (nova-2 multilingual model)
- Groq (llama-3.1-8b-instant) for ultra-fast translation
- ElevenLabs Flash v2.5 for natural voice synthesis (~75ms latency) with language-specific voices
- OpenAI GPT-5.1 for the AI agent (meeting summaries, action detection)
- Next.js 15 + React 19 for the frontend
- motion/react for smooth UI animations (draggable/resizable agent panel)
- Neon (Postgres) + Drizzle ORM for data persistence
- Vercel for deployment

Voice mapping: Each language has a native-sounding ElevenLabs voice:

English: Adam, Spanish: Lily, Portuguese: Freya, French: Charlotte
German: Hannah, Italian: Serena, Japanese: Elli, Korean: Michael
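
A mapping like this can be kept as a small lookup table. The sketch below is one possible shape: the voice names come from the list above, while the two-letter language codes, the `voiceFor` helper, and the fallback to the English voice are assumptions for illustration. A real ElevenLabs request needs a voice ID rather than a name, and those IDs are deliberately left out here.

```typescript
// Voice names from the list above, keyed by an assumed two-letter language code.
const VOICE_BY_LANGUAGE: Record<string, string> = {
  en: "Adam",
  es: "Lily",
  pt: "Freya",
  fr: "Charlotte",
  de: "Hannah",
  it: "Serena",
  ja: "Elli",
  ko: "Michael",
};

// Assumption: unknown languages fall back to the English voice.
const DEFAULT_VOICE = "Adam";

// Hypothetical helper: resolves "pt-BR" or "PT" to the voice registered for "pt".
function voiceFor(languageCode: string): string {
  const base = languageCode.toLowerCase().split("-")[0] ?? "";
  return VOICE_BY_LANGUAGE[base] ?? DEFAULT_VOICE;
}
```

For example, `voiceFor("pt-BR")` resolves to the Portuguese voice because the region suffix is dropped before the lookup.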

## Challenges we faced

Audio routing complexity: We mute the original remote audio and play translated TTS instead. Managing this without echo or overlap required a queue-based playback system.
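
A queue-based playback system of this kind can be quite small. The following is a minimal sketch, not the project's implementation: `PlaybackQueue` and `StartPlayback` are hypothetical names, and the actual audio playback is injected as a callback so that clips are started strictly one after another and never overlap.

```typescript
// Callback that starts playing a clip and calls onEnded once it has finished.
type StartPlayback<T> = (clip: T, onEnded: () => void) => void;

class PlaybackQueue<T> {
  private readonly pending: T[] = [];
  private playing = false;
  private readonly start: StartPlayback<T>;

  constructor(start: StartPlayback<T>) {
    this.start = start;
  }

  // Returns immediately; the clip plays once everything queued before it has ended.
  enqueue(clip: T): void {
    this.pending.push(clip);
    if (!this.playing) this.playNext();
  }

  private playNext(): void {
    if (this.pending.length === 0) {
      this.playing = false;
      return;
    }
    const clip = this.pending.shift() as T;
    this.playing = true;
    // The next clip starts only when the current one reports that it ended.
    this.start(clip, () => this.playNext());
  }
}
```

In a browser, the `start` callback could create an `Audio` element for the clip, listen for its `ended` event, and call `play()`. Because `enqueue` never waits, incoming translations are not blocked by a clip that is still playing.
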
Latency optimization: Real-time translation must feel instant. We combined:
- Groq's llama-3.1-8b for fast translation
- ElevenLabs Flash v2.5 with optimizeStreamingLatency=4
- Non-blocking audio queue
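
To make the TTS setting concrete, here is a hedged sketch of a request against the ElevenLabs streaming text-to-speech REST endpoint. It assumes the REST query parameter `optimize_streaming_latency` is the counterpart of the SDK option `optimizeStreamingLatency` mentioned above and that `eleven_flash_v2_5` is the model ID for Flash v2.5; both should be checked against the current ElevenLabs documentation. `buildFlashTtsRequest` is a hypothetical helper, and the voice ID and API key are placeholders supplied by the caller.

```typescript
interface TtsRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Builds (but does not send) a streaming TTS request for the given voice and text.
function buildFlashTtsRequest(voiceId: string, text: string, apiKey: string): TtsRequest {
  const url =
    "https://api.elevenlabs.io/v1/text-to-speech/" +
    encodeURIComponent(voiceId) +
    "/stream?optimize_streaming_latency=4"; // most aggressive latency setting

  return {
    url,
    init: {
      method: "POST",
      headers: {
        "xi-api-key": apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text, model_id: "eleven_flash_v2_5" }),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)`. Keeping the request construction separate from the network call makes it easy to check the parameters without spending API credits.
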
Multi-language sync: Each participant needs their personalized audio stream. We handle transcription events per-speaker and route translations to the correct listeners.
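
One way to picture the routing step is a function that, for each utterance, works out which listeners need which target language. This is a sketch under assumptions, not the project's code: the `Participant` shape and `planTranslations` are hypothetical, and skipping listeners who already share the speaker's language is a design choice made for this example.

```typescript
interface Participant {
  id: string;
  language: string; // preferred listening language, e.g. "en"
}

// Maps each required target language to the listener ids who should receive it.
function planTranslations(
  speakerId: string,
  spokenLanguage: string,
  participants: Participant[],
): Map<string, string[]> {
  const plan = new Map<string, string[]>();
  for (const p of participants) {
    if (p.id === speakerId) continue; // speakers do not hear themselves
    if (p.language === spokenLanguage) continue; // assumption: no translation needed
    const listeners = plan.get(p.language) ?? [];
    listeners.push(p.id);
    plan.set(p.language, listeners);
  }
  return plan;
}
```

Grouping by language means each utterance is translated once per target language rather than once per listener, which keeps the number of translation and TTS calls down as the call grows.
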
Transcription reliability: Daily.co sometimes doesn't provide translations, so we built a Groq fallback that kicks in automatically.
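
The fallback logic can be sketched as a small helper that prefers a translation delivered with the transcription event and only calls a second translator when none arrived. The names `Translate` and `translationFor` are hypothetical, and the real Daily.co event shape is not modeled here; the provided translation is simply passed in as an optional string.

```typescript
// Any async translator, for example a thin wrapper around a Groq chat completion.
type Translate = (text: string, targetLanguage: string) => Promise<string>;

// Uses the provided translation when present, otherwise asks the fallback translator.
async function translationFor(
  providedTranslation: string | undefined,
  originalText: string,
  targetLanguage: string,
  fallback: Translate,
): Promise<string> {
  const provided = providedTranslation?.trim();
  if (provided) return provided;
  // Missing or empty translation: fall back automatically.
  return fallback(originalText, targetLanguage);
}
```

Treating an empty or whitespace-only string the same as a missing translation avoids sending silent clips to the TTS step.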

## What we learned

- ElevenLabs Flash v2.5 is fast enough for live calls: roughly 75ms of latency makes real-time TTS viable
- Daily.co's transcription API is powerful but requires careful event handling for edge cases
- The "universal translator" from Star Trek is finally possible in 2026
- Building with your own pain point makes development 10x more focused

## What's next

- Voice cloning so translations sound like the original speaker (ElevenLabs supports this)
- Meeting summaries via Resend: automatic email with transcript + action items
- Support for 50+ languages: expand beyond the current 10
- Mobile apps: React Native with Daily.co SDK
- Agent improvements: better action detection, calendar integration

## Built with

Next.js, React, TypeScript, Tailwind CSS, Daily.co, Deepgram, Groq, ElevenLabs, OpenAI, Neon, Drizzle ORM, Vercel, motion/react

Submitted to

Agents Hackathon in Brazil
- Winner Prize by ElevenLabs

Created by

I contributed by developing the agents that generate and execute actions in real time, as well as the AI agent that interacts with participants using the live conversation context.

Antunes
Cristian Correa
Experienced Data & Analytics Engineer passionate about AI, LLM, and Quantum Machine Learning, dedicated to mentoring in tech :)
Railly Hugo
Anthony Cueva