Inspiration
We noticed that most dubbed content still sounds robotic, emotionless, and disconnected from the original performance. Even when the translation is correct, the emotional tone and natural feel disappear. We wanted to fix that. Tonexia was inspired by the idea of true-tone translation: keeping the original speaker’s voice, emotion, and style intact while delivering the content in any language.
What it does
Tonexia takes a video, extracts its audio and transcript, and generates a natural, expressive translated voice that closely matches the original speaker’s tone. The new audio is then aligned with the video’s timing so the translated speech fits naturally into the original scene. Finally, the system returns a fully processed, voice-preserved, translated video to the user.
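As a rough illustration of the first and last steps (pulling the audio out and putting the dubbed track back in), here is a minimal sketch that only builds command lines. It assumes the ffmpeg CLI is used for both steps, which this write-up does not state; the function names and file paths are hypothetical.

```python
from typing import List


def extract_audio_cmd(video_path: str, wav_path: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops the video stream and writes mono 16-bit PCM audio.

    Assumption: ffmpeg is the extraction tool. The project may use something else.
    """
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input video
        "-vn",                   # no video in the output
        "-acodec", "pcm_s16le",  # 16-bit PCM WAV
        "-ar", str(sample_rate), # audio sample rate in Hz
        "-ac", "1",              # mono
        wav_path,
    ]


def mux_dub_cmd(video_path: str, dub_path: str, out_path: str) -> List[str]:
    """Build an ffmpeg command that keeps the original video stream and swaps in the dubbed audio."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,        # input 0: original video
        "-i", dub_path,          # input 1: translated audio
        "-map", "0:v:0",         # take video from input 0
        "-map", "1:a:0",         # take audio from input 1
        "-c:v", "copy",          # do not re-encode the video
        "-shortest",             # stop at the shorter of the two streams
        out_path,
    ]
```

The returned lists can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed. Copying the video stream instead of re-encoding it keeps the final step fast.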
How we built it
- Designed the user interface in Base44 (upload, language selection, and output).
- Used n8n as our backend automation workflow, because Base44’s backend features require a paid plan.
- Extracted the text transcript and audio samples from the input video.
- Passed this data to an audio-generation LLM to produce translated speech in the original actor’s voice.
- Used another LLM/agent to align the timing of the new audio with the video.
- Reconstructed the video and uploaded it back to Base44’s output panel.
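The alignment step above is handled by an LLM/agent in our pipeline. For comparison, the sketch below shows a simpler arithmetic approach, not the one we used: compute how much each translated clip would need to be sped up or slowed down to fit the original line's time slot. The `Segment` type, the `tempo_factor` function, and the clamp limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float         # start of the original line in the video, in seconds
    end: float           # end of the original line in the video, in seconds
    dub_duration: float  # length of the generated translated speech, in seconds


def tempo_factor(seg: Segment, min_tempo: float = 0.8, max_tempo: float = 1.25) -> float:
    """Return the playback-speed multiplier that fits the dub into the original slot.

    A value above 1.0 means the dub must be sped up, below 1.0 slowed down.
    The result is clamped so the voice is never stretched far enough to sound
    unnatural; the default limits here are illustrative guesses.
    """
    slot = seg.end - seg.start
    if slot <= 0 or seg.dub_duration <= 0:
        raise ValueError("segment and dub durations must be positive")
    raw = seg.dub_duration / slot
    return max(min_tempo, min(max_tempo, raw))
```

When the raw factor falls outside the clamp range, speed changes alone cannot fit the line, and the translation itself would need to be shortened or lengthened.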
Challenges we ran into
- Base44’s backend costs $80/month, so we had to engineer a full backend in n8n instead.
- Keeping the translated voice emotional and human rather than robotic.
- Getting the timing right so translated dialogue does not start early or late.
- Handling long video files without slowing down the workflow.
- Integrating all components smoothly while keeping the process automated.
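One common way to keep long videos from stalling a workflow is to process them in fixed-length time windows. The sketch below illustrates that idea under assumed parameters; it is not a description of how our n8n workflow splits files, and `chunk_windows` and its default chunk length are hypothetical.

```python
from typing import List, Tuple


def chunk_windows(total_seconds: float, chunk_seconds: float = 60.0) -> List[Tuple[float, float]]:
    """Split a video of the given length into consecutive (start, end) windows.

    Every window is chunk_seconds long except possibly the last one, which
    ends at total_seconds.
    """
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and chunk_seconds must be > 0")
    windows: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        start = end
    return windows
```

A production version would likely move each cut to the nearest pause in speech so that no sentence is split across two windows.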
Accomplishments that we're proud of
- Successfully built a working end-to-end multilingual dubbing system.
- Achieved voice-preserved translation that sounds expressive and natural.
- Designed a smooth user flow from upload → translation → output.
- Integrated multiple tools (Base44 + n8n + LLMs) into one seamless pipeline.
- Created a solution that can genuinely help break language barriers.
What we learned
- How to build full workflows in n8n to replace paid backend services.
- How audio-generation LLMs work with voice samples, transcripts, and frequency patterns.
- The importance of timing alignment in video dubbing.
- Frontend–backend integration using Base44 and automation tools.
- How to manage rendering, generation time, and resource usage efficiently.
What's next for Tonexia: True-Tone Language Translation
- Add emotion tuning (e.g., more intense, calmer, more energetic).
- Improve video-audio alignment for even smoother results.
- Build a mobile app version of Tonexia.
- Add support for live translation in meetings or streams.
- Expand into a full AI-powered content localization platform.
Built With
- base44
- chatgpt
- murf
- n8n