About the Project
Inspiration
We wanted to break language barriers and make video content accessible to everyone, regardless of the language they speak. The idea was to create a tool that makes translating videos fast, accurate, and user-friendly.
What It Does
Duckslator translates videos from one language to another with voice cloning, so the voice tone and expression in the translated video mimic the original. It supports common languages such as Spanish, French, Italian, Russian, and Japanese, and offers high accuracy and low latency, making it a good fit for creators and businesses alike.
How We Built It
We used:
- OpenAI's Whisper for multilingual speech recognition.
- The googletrans library for translating the transcript into the target language.
- The open-source XTTS model by Coqui AI for converting translated text into speech.
- Streamlit for the intuitive and user-friendly frontend.
- FastAPI to handle the translation logic and AI processing on the backend.
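Glued together, the pieces above form a transcribe → translate → synthesize pipeline. Below is a minimal sketch of that flow; the library calls (`whisper.load_model`, `googletrans.Translator`, Coqui's `TTS.api`) follow those packages' public APIs, but the function name `translate_video` and the model choices are our own illustration, not the project's actual code.

```python
# Supported target languages from the write-up, mapped to the ISO 639-1
# codes that googletrans and XTTS expect.
LANG_CODES = {
    "Spanish": "es", "French": "fr", "Italian": "it",
    "Russian": "ru", "Japanese": "ja",
}

def translate_video(audio_path: str, target_language: str) -> str:
    """Transcribe, translate, and re-voice one audio track.

    Returns the path of the synthesized audio file.
    """
    # Imports are deferred so the module loads without the heavy models.
    import whisper                      # multilingual speech recognition
    from googletrans import Translator  # text translation
    from TTS.api import TTS             # Coqui XTTS voice cloning

    dest = LANG_CODES[target_language]

    # 1. Transcribe the original speech with Whisper.
    model = whisper.load_model("base")
    text = model.transcribe(audio_path)["text"]

    # 2. Translate the transcript into the target language.
    translated = Translator().translate(text, dest=dest).text

    # 3. Speak the translation in the original speaker's voice:
    #    XTTS clones the voice from a reference clip of the source audio.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    out_path = "translated.wav"
    tts.tts_to_file(text=translated, speaker_wav=audio_path,
                    language=dest, file_path=out_path)
    return out_path
```

In the real app this function would sit behind a FastAPI endpoint, with the Streamlit frontend uploading the video and polling for the result.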
Challenges We Ran Into
Integrating Whisper and XTTS to handle long video files efficiently was tricky, especially ensuring low latency. Creating a seamless flow between the backend and Streamlit while handling diverse language inputs was also challenging.
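A common way to keep latency down on long files is to split the audio into overlapping windows and process them independently; the sketch below shows that idea in pure Python. The helper name, window length, and overlap are our assumptions for illustration, not the project's actual code.

```python
def chunk_spans(total_s: float, chunk_s: float = 30.0, overlap_s: float = 1.0):
    """Split a recording of `total_s` seconds into overlapping
    (start, end) windows, so each chunk fits a model's context and
    chunk boundaries don't cut words in half."""
    spans, start = [], 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        spans.append((start, end))
        if end == total_s:
            break
        # Step back by the overlap so adjacent chunks share some audio.
        start = end - overlap_s
    return spans
```

Each span can then be cut from the audio track, transcribed, and synthesized on its own, with the overlap used to stitch the results back together.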
Accomplishments That We're Proud Of
- Building a tool that’s both accurate and fast.
- Successfully translating long videos into multiple languages with natural-sounding voices.
- Creating a simple yet powerful user experience with Streamlit.
What We Learned
- How to combine generative AI tools effectively.
- Optimizing AI performance for larger tasks like video translation.
- Designing clean, user-friendly interfaces quickly using Streamlit.
What's Next for Duckslator
- Add more languages to expand reach.
- Improve voice customization for personalized outputs.
- Explore real-time translation for live videos.
- Expand accessibility features, like subtitles, for inclusivity.