We love anime. And we love dubbed anime. So let's bring the two together and dub all anime into these attractive Google text-to-speech voices. In all seriousness, this is a way to distribute media to every region of the world without the hassle of hiring translators and new voice actors. Consider new production companies that want to expand their media across the world but lack the funds to hire actors for each language. Using text-to-speech and speech-to-text technology, we can offer an inexpensive way to distribute this media worldwide.
What it does
This application takes in an MP4 (preferably a TV show), locates the audio in the MP4, and converts the speech from the source language to another language using text-to-speech, effectively dubbing the media in another language.
How we built it
To build this, we used Google Cloud Text-to-Speech, Translation, and Speech-to-Text, along with Python. We use approximation algorithms to estimate where each sentence is spoken in the audio, and from that location we know where to place the text-to-speech version of the sentence. Thanks to the great open-source libraries pydub and moviepy, we had awesome tools for editing audio and video files.
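As a rough illustration of the placement step, here is a minimal sketch of planning where each dubbed clip goes once sentence locations are known. The names `Placement` and `plan_placements` are hypothetical, and the speed-up factor is our own assumption about how a too-long clip might be squeezed into its slot; it is not necessarily how SubADubDub handles that case.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    position_ms: int  # where the dubbed clip starts on the output track
    speed: float      # playback-rate factor needed to fit the original slot


def plan_placements(sentence_spans_ms, clip_lengths_ms):
    """Pair each located sentence with its text-to-speech clip.

    sentence_spans_ms: list of (start_ms, end_ms) for each original sentence.
    clip_lengths_ms:   length in ms of the dubbed clip for each sentence.
    """
    if len(sentence_spans_ms) != len(clip_lengths_ms):
        raise ValueError("need exactly one clip per sentence")

    plan = []
    for (start, end), clip_len in zip(sentence_spans_ms, clip_lengths_ms):
        slot = max(end - start, 1)  # guard against zero-length spans
        # Assumption: never slow a clip down, only speed it up when it
        # would otherwise run past the end of the original sentence.
        speed = max(1.0, clip_len / slot)
        plan.append(Placement(position_ms=start, speed=speed))
    return plan


if __name__ == "__main__":
    # Two sentences whose spans overlap, as when speakers interrupt each other.
    for p in plan_placements([(1000, 2200), (2000, 3000)], [1000, 1500]):
        print(p)
```

With pydub, each planned position could then be passed to `AudioSegment.overlay(clip, position=...)`, which mixes clips rather than concatenating them, so overlapping sentences can coexist on one track.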
Challenges we ran into
The biggest challenge was finding accurate locations for each sentence spoken in the file. Because Google Cloud's speech recognition isn't perfect and sometimes missed or misinterpreted words, we had to come up with a good approximation algorithm to find the location (in terms of time) of each sentence.
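One way to approximate a sentence's location when some words are missing or misheard is fuzzy sequence matching. The sketch below uses Python's standard `difflib.SequenceMatcher` to align a sentence against a list of recognized words with timestamps. It illustrates the general idea only and is not the algorithm we actually shipped; `locate_sentence` and the `(word, start_s, end_s)` tuple layout are assumptions made for this example.

```python
from difflib import SequenceMatcher


def normalize(word):
    """Lowercase a word and drop punctuation so comparisons are forgiving."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def locate_sentence(sentence, timed_words):
    """Estimate the (start_s, end_s) of a sentence in the audio.

    timed_words: list of (word, start_s, end_s) tuples, assumed to come
    from a speech recognizer that reports per-word time offsets.
    Returns None when no word of the sentence can be matched.
    """
    target = [w for w in (normalize(w) for w in sentence.split()) if w]
    heard = [normalize(w) for w, _, _ in timed_words]

    # Align the two word sequences; gaps absorb missing or misheard words.
    matcher = SequenceMatcher(a=heard, b=target, autojunk=False)
    blocks = [b for b in matcher.get_matching_blocks() if b.size > 0]
    if not blocks:
        return None

    first, last = blocks[0], blocks[-1]
    start_s = timed_words[first.a][1]                 # first matched word
    end_s = timed_words[last.a + last.size - 1][2]    # last matched word
    return start_s, end_s


if __name__ == "__main__":
    words = [
        ("hello", 0.0, 0.4), ("there", 0.4, 0.8),
        ("how", 1.0, 1.2), ("are", 1.2, 1.4),
        ("you", 1.4, 1.7), ("today", 1.7, 2.2),
    ]
    # The recognizer missed the word "doing"; the span is still recovered.
    print(locate_sentence("How are you doing today?", words))
```

In practice the search would likely be restricted to a window of words near the expected position, since a common word matched far away would stretch the estimated span.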
Accomplishments that we're proud of
Finding the location of each sentence within a small margin of error was a big accomplishment for us. This isn't just a dub that translates text and plays it back: the alternate language plays in sync with the original speakers, even when they interrupt each other!
What we learned
We learned how to edit audio files using Python and how to use the Google Cloud Speech API.
What's next for SubADubDub
We are waiting for Google to release more speakers and more languages for speech-to-text. With more of these, we can cover a wider range of distribution.