SubExtract AI

Inspiration

As video content continues to grow rapidly, the demand for subtitles, both hardcoded into the frame (hardsub) and delivered as separate soft subtitle tracks (softsub), has become essential for accessibility and localization. Extracting subtitles from hardsubbed videos, however, is usually a manual, time-consuming process or requires expensive tools. We wanted to build a low-cost, intelligent solution that automates this task efficiently.

What it does

Our system processes hardsubbed videos and uses a combination of OCR and AI models to extract the burned-in subtitles, convert them into editable soft subtitle files (e.g., .srt, .vtt), and hand the results off cleanly to content creators and translators. The goal is to cut cost, manual labor, and turnaround time.
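
To illustrate the soft-subtitle output, here is a minimal sketch of serializing extracted cues to the .srt format; the helper names and the `(start, end, text)` cue tuples are our own illustration, not the project's actual API:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start_seconds, end_seconds, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

The same cue list could just as easily be rendered as WebVTT, which differs mainly in its header line and its use of `.` instead of `,` in timestamps.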

How we built it

We used Python with OpenCV and Tesseract for OCR, combined with a deep-learning subtitle segmentation model. The frontend is built with React for ease of use, and the backend is hosted on a lightweight cloud server (e.g., Render or Vercel) with video-processing support. We also integrated FFmpeg for video frame extraction and processing.
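
A condensed sketch of that pipeline, assuming OpenCV (`cv2`) and `pytesseract` are installed alongside a Tesseract binary. The sampling rate, the bottom-band crop, and all function names are illustrative choices, not the project's actual code:

```python
def extract_frame_texts(video_path, samples_per_second=2):
    """OCR the subtitle band of sampled frames; returns (frame_index, text) pairs.

    Imports are deferred so the pure helpers below work without cv2 installed.
    """
    import cv2
    import pytesseract

    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    step = max(int(fps // samples_per_second), 1)
    results, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            band = frame[int(frame.shape[0] * 0.8):, :]  # bottom 20% of the frame
            results.append((idx, pytesseract.image_to_string(band)))
        idx += 1
    cap.release()
    return results, fps, step


def frames_to_cues(frame_texts, fps, step):
    """Merge per-frame OCR results (frame_index, text) into (start_s, end_s, text) cues.

    Consecutive frames with identical text collapse into one cue; the cue
    ends when the text changes or disappears.
    """
    cues = []
    current = None   # (start_frame, text) of the open cue
    prev_idx = None
    for idx, text in frame_texts:
        text = text.strip()
        if current and text != current[1]:
            cues.append((current[0] / fps, (prev_idx + step) / fps, current[1]))
            current = None
        if text and current is None:
            current = (idx, text)
        prev_idx = idx
    if current:
        cues.append((current[0] / fps, (prev_idx + step) / fps, current[1]))
    return cues
```

Sampling a couple of frames per second rather than every frame keeps OCR cost manageable while still catching normal subtitle durations.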

Challenges we ran into

  • Achieving high OCR accuracy with diverse font styles and backgrounds.
  • Synchronizing subtitle start and end times with the video.
  • Handling large video file sizes within the cloud processing limits.
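
For the OCR-accuracy challenge, one common mitigation is binarizing the cropped subtitle band before OCR so varied backgrounds collapse to a flat one. This pure-Python sketch shows the idea; a real pipeline would use OpenCV thresholding, and the majority-dark inversion heuristic is our assumption, not the project's actual preprocessing:

```python
def binarize(gray, threshold=128):
    """Threshold a grayscale image (rows of 0-255 ints) to pure black and white.

    Tesseract tends to prefer dark text on a light background, while hardsubs
    are usually light text on darker footage, so if most pixels come out
    black we invert the result.
    """
    binary = [[255 if px >= threshold else 0 for px in row] for row in gray]
    dark = sum(px == 0 for row in binary for px in row)
    total = sum(len(row) for row in binary)
    if dark > total / 2:
        binary = [[255 - px for px in row] for row in binary]
    return binary
```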

What we learned

We learned how to integrate multiple technologies, from video processing to OCR to AI models, into a single streamlined pipeline. We also explored optimization techniques to improve both speed and accuracy.
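
One optimization of the kind mentioned above is deduplicating sampled frames so the expensive OCR step runs only when the subtitle image actually changes. This hashing sketch is illustrative, not the project's actual implementation:

```python
import hashlib


def dedupe_frames(frames):
    """Yield only (index, frame_bytes) pairs whose pixel content differs
    from the previous frame, so OCR runs once per distinct subtitle image."""
    last = None
    for idx, frame_bytes in frames:
        digest = hashlib.md5(frame_bytes).digest()
        if digest != last:
            last = digest
            yield idx, frame_bytes
```

Since a subtitle typically stays on screen for dozens of frames, this alone can skip most OCR calls on a sampled stream.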

What's next for SubExtract AI

We plan to:

  • Add multilingual subtitle support.
  • Improve the user interface for subtitle correction.
  • Release the project as an open-source tool for the video editing and accessibility communities.
