Inspiration
The inspiration for our project came from YouTube, Coursera, and Lynda.com, which generate subtitles for every video posted on their platforms. Subtitles let people from different parts of the world watch and understand the content regardless of the speaker's accent. In a diverse environment such as USC, a subtitle feature like this would be extremely useful for all users.
What it does
Our web app is an online media player that plays DEN videos. It offers options such as turning subtitles on or off, fullscreen mode, and suggestions for other videos to watch. The UI is also cleaner than the official DEN website.
How we built it
We built the application by first collecting the videos from the DEN website. The audio from these videos was then extracted, cleaned, and processed using the GCP Speech-to-Text API. Once we had the transcription for a video, we used it to generate a subtitles file. Next, we built a complete web application using HTML5 and Bootstrap to display the video content, subtitles, suggested videos, etc. It is hosted on an Apache server.
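The step from transcription to subtitles file can be sketched as follows. This is a minimal illustration, not our exact code: it assumes the subtitles are written as WebVTT (the format an HTML5 `<track>` element reads) and that the transcription is available as a list of words with start and end times in seconds, which the Speech-to-Text API can return when word time offsets are enabled. The names `Word`, `format_timestamp`, `words_to_webvtt`, and the `max_words` parameter are hypothetical.

```python
from typing import List, Tuple

# Assumed input shape: (word, start_seconds, end_seconds)
Word = Tuple[str, float, float]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def words_to_webvtt(words: List[Word], max_words: int = 8) -> str:
    """Group timed words into cues of at most max_words and render WebVTT text."""
    lines = ["WEBVTT", ""]
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = chunk[0][1]   # start time of the first word in the cue
        end = chunk[-1][2]    # end time of the last word in the cue
        text = " ".join(word for word, _, _ in chunk)
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")      # blank line terminates the cue
    return "\n".join(lines)
```

Writing the returned string to a `.vtt` file and pointing a `<track kind="subtitles">` element at it is one way the player could pick it up. Fixed-size word groups keep the sketch short; splitting cues on pauses or punctuation would likely read better.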
Challenges we ran into
The main challenges we ran into were:
- Extraction of DEN videos: The videos are not downloadable, so we had to screen-grab them.
- Python 2 library migration: Many of the common libraries we needed were not available on Python 3.
- Pre-processing steps: Cleaning the audio to smooth out disturbances, so that speech detection would work as well as possible, was the hardest part.
- Time: Getting the web app up with a functional backend in the time available was a real challenge.
Accomplishments that we're proud of
Getting the entire application into working condition within the time constraints is an accomplishment we are all proud of.
What we learned
- Use of the GCP APIs and the powerful platform behind them.
- Nuances of audio processing.
- Embedding media capabilities into web apps.
What's next for DEN++
We have a few things planned:
- Subtitle translation.
- Putting the entire app online by hosting the videos on servers, etc.
- Improving the pre-processing steps and transcription process through parallel processing.
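As a rough idea of what the parallel-processing step could look like, the sketch below transcribes several audio chunks concurrently with a thread pool. It is only an assumption about a possible design: `transcribe_in_parallel` is a hypothetical helper, and `transcribe` stands in for whatever function sends one chunk to the speech API and returns its text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def transcribe_in_parallel(
    chunks: List[str],
    transcribe: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Run transcribe on every chunk concurrently, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input chunks,
        # so the transcript pieces can be joined back together directly.
        return list(pool.map(transcribe, chunks))
```

Threads fit here because each call would spend most of its time waiting on the network; CPU-heavy audio cleaning would more likely benefit from separate processes.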