Inspiration
Content creation and social media reach audiences all over the world, but most videos are only available in their creator's primary language. We wanted to address this by building software that translates the audio of a video into a language the viewer is familiar with.
What it does
Our program takes a video or audio file from the user and reports which language it detects in the file. It then prompts the user to pick the language they want the audio translated into. Once the user has picked a language, the program uses the given voice ID to determine which AI voice clone will speak the translated audio. It also creates a new .wav file containing the translated audio, along with a text file containing the captions in each language.
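The flow described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: the `transcribe`, `pick_target`, `translate`, and `synthesize` callables are hypothetical stand-ins for the detection, prompt, translation, and cartesia.ai text-to-speech steps, and the language table, file names, and 16-bit mono 44.1 kHz audio format are assumptions made for the example.

```python
import wave
from pathlib import Path

# Hypothetical subset of target languages, for illustration only.
SUPPORTED_TARGETS = {"es": "Spanish", "fr": "French", "de": "German"}


def choose_target(code):
    """Normalize and validate the language code the user picked."""
    code = code.strip().lower()
    if code not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {code!r}")
    return code


def write_wav(path, pcm, sample_rate=44100):
    """Save raw PCM bytes as a .wav file (assumes 16-bit mono audio)."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)


def write_captions(path, captions):
    """Write one caption block per language code."""
    blocks = [f"[{lang}]\n{text}" for lang, text in captions.items()]
    path.write_text("\n\n".join(blocks) + "\n", encoding="utf-8")


def dub(media_path, voice_id, out_dir, transcribe, pick_target, translate, synthesize):
    """Run the detect -> pick -> translate -> speak -> save flow.

    All four callables are hypothetical placeholders:
      transcribe(media_path)          -> (text, detected_language_code)
      pick_target(detected_language)  -> language code chosen by the user
      translate(text, source, target) -> translated text
      synthesize(text, voice_id, target) -> raw PCM bytes
    """
    text, source = transcribe(media_path)
    target = choose_target(pick_target(source))
    translated = translate(text, source, target)
    pcm = synthesize(translated, voice_id, target)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    wav_path = out_dir / f"dubbed_{target}.wav"
    txt_path = out_dir / "captions.txt"
    write_wav(wav_path, pcm)
    write_captions(txt_path, {source: text, target: translated})
    return wav_path, txt_path
```

Passing the stages in as functions keeps the sketch independent of any particular library, so each step can be swapped or tested on its own.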
How we built it
We learned how to use cartesia.ai and cloned the voice of one of our members. We then followed the steps in the cartesia.ai GitHub repository to install the libraries their code requires. Afterwards, we worked through their text-to-speech example to understand how it works, then extended it with translation and audio detection libraries.
Challenges we ran into
Originally, we could only translate our audio from English to Spanish. Attempts to translate into other languages produced many errors, and the resulting audio sounded demonic and alien-like. We eventually overcame this and can now translate into multiple languages; however, we still cannot translate from other languages into English. Another challenge we still face is the long run time: the program takes about a minute to fully execute.
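One way to investigate the run time is to measure each stage separately and see which one dominates. The helper below is a small sketch of that idea using only the Python standard library; the stage names and the `time.sleep` calls in the demo are placeholders, not measurements from our program.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record how long the wrapped block takes, in seconds, under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def slowest(timings):
    """Return the name of the stage that took the longest."""
    return max(timings, key=timings.get)


if __name__ == "__main__":
    timings = {}
    # Placeholder workloads standing in for the real pipeline stages.
    with timed("transcribe", timings):
        time.sleep(0.01)
    with timed("synthesize", timings):
        time.sleep(0.05)
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds:.3f}s")
    print("slowest stage:", slowest(timings))
```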
Accomplishments that we're proud of
We are proud of being able to extract and transcribe audio from any given video or audio file, translate that audio, and translate from English into 13 other languages rather than only Spanish.
What we learned
We learned how to use cartesia.ai as well as a variety of Python audio libraries. Through this project we also learned how cartesia.ai's text-to-speech works and how to adapt it to multiple languages.
What's next for REELate
We would like to support more than 14 languages and to translate from other languages into English. We have also discussed fixing our slow run time and eventually implementing real-time audio dubbing and subtitles.
Built With
- cartesia
- python