Inspiration

Online video without subtitles is inaccessible to people with hearing impairments. Language learners also need subtitles they can interact with: when they meet a word they don't know, selectable subtitle text lets them translate it on the spot with a browser extension. Finally, some viewers need the subtitles translated into another language.

To address these three problems, I built a transcription app that transcribes videos and streams them with interactive subtitles.

What it does

Transcription Studio transcribes any video you upload and streams it with interactive subtitles. You can watch the video in your browser, download the subtitled video, or download the transcription as an ".srt" file. There is also a translation feature: once you select a language and click translate, the transcription, subtitles, and timeline segments are all translated into the language you chose.
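As a rough illustration of the ".srt" export, the sketch below turns a list of timed segments into SRT text. It assumes each segment is a dict with "start" and "end" times in seconds and a "text" string, which is the shape Whisper returns; the function names are hypothetical and not taken from the project's code.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue: sequence number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Because translation only changes the "text" field, the same function could write the translated subtitles by passing it the translated segments.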

How we built it

  • Backend: Python/Flask, with Whisper for transcription, FFmpeg for video processing, and the Google Translate API for multilingual support
  • Frontend: HTML/CSS/JavaScript, with real-time subtitle sync and an interactive timeline
  • Model: I used the Whisper large-v3-turbo model for video transcription. It is a publicly available multilingual speech-to-text model, and it proved reliable, transcribing my videos with near-perfect accuracy.
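One way the "download the subtitled video" step could use FFmpeg is to burn the ".srt" file into the picture with FFmpeg's subtitles filter. The sketch below only builds the command line; whether the project burns subtitles in or attaches them as a separate track is an assumption here, and build_burn_in_command is a hypothetical helper name.

```python
def build_burn_in_command(video: str, srt: str, output: str) -> list[str]:
    """Build an FFmpeg command that renders subtitles onto the video frames.

    Assumes an FFmpeg build with libass (needed by the subtitles filter)
    and simple file paths; paths containing characters such as ':' or
    quotes need extra escaping inside the filter string.
    """
    return [
        "ffmpeg",
        "-y",                          # overwrite the output file if it exists
        "-i", video,                   # input video
        "-vf", f"subtitles={srt}",     # draw the subtitles onto each frame
        "-c:a", "copy",                # keep the audio stream unchanged
        output,
    ]
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the Flask backend. Passing a list rather than a shell string avoids quoting problems with uploaded file names.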

Challenges we ran into

  • Real-time subtitle synchronization
  • Efficient translation of long transcriptions
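Both challenges can be sketched in a few lines. The first function finds the subtitle to show at a given playback time with a binary search over segment start times, so the lookup stays fast even for long videos. The second groups segment texts into batches so that a long transcription needs fewer translation requests. This is a sketch under assumptions, not the project's actual code: the function names are hypothetical, the 4000-character limit is an arbitrary placeholder, and the project's frontend does its sync in JavaScript, so the Python here only illustrates the logic.

```python
from bisect import bisect_right


def active_segment(segments, start_times, t):
    """Return the segment being spoken at time t, or None during silence.

    segments must be sorted by start time; start_times is the matching
    list of segment["start"] values, precomputed once.
    """
    i = bisect_right(start_times, t) - 1  # last segment starting at or before t
    if i >= 0 and t < segments[i]["end"]:
        return segments[i]
    return None


def batch_texts(texts, max_chars=4000):
    """Group texts into batches of at most max_chars combined characters.

    A single text longer than max_chars is placed in a batch of its own.
    """
    batches, current, size = [], [], 0
    for text in texts:
        if current and size + len(text) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        batches.append(current)
    return batches
```

Keeping each segment as a separate item inside a batch means the translated texts can be matched back to their original timestamps by position.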

Accomplishments that we're proud of

  • Highly accurate AI transcription with Whisper
  • Translation into 20+ languages
  • Completely free and open-source

What we learned

  • AI speech recognition with Whisper
  • Video processing with FFmpeg

What's next for Transcription Studio

Making subtitles more interactive for language learners:

  • Click-to-translate: Instantly translate words by clicking them
  • Vocabulary tracker: Save clicked words for later review
  • Dual subtitles: Show original and translated subtitles together
  • Speaker diarization: Identify different speakers