## Inspiration
As video content continues to grow rapidly, the demand for subtitles — both hardcoded (hardsub) and soft subtitles (softsub) — has become increasingly essential for accessibility and localization. However, extracting subtitles from hardsubbed videos is often a manual, time-consuming process or requires expensive tools. We wanted to build a low-cost and intelligent solution to automate this task efficiently.
## What it does
Our system processes hardsubbed videos and uses a combination of OCR and AI models to extract the burned-in subtitles, convert them into editable soft-subtitle files (e.g., .srt, .vtt), and provide easy integration for content creators and translators. It aims to reduce cost, manual labor, and turnaround time.
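To make the output format concrete, an .srt file is just numbered cues with `HH:MM:SS,mmm` timestamps. A minimal serializer for such cues might look like this (a sketch of ours, not code from the project; the helper names are hypothetical):

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(cues) -> str:
    """Serialize (start_sec, end_sec, text) tuples as an .srt document."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([(0.0, 1.5, "Hello")])` produces a single cue running from `00:00:00,000` to `00:00:01,500`.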
## How we built it
We used Python with OpenCV and Tesseract for OCR, combined with a deep-learning subtitle segmentation model. The frontend is built with React for ease of use, and the backend is hosted on a lightweight cloud server (e.g., Render or Vercel) with video processing support. We also integrated FFmpeg for video frame extraction and processing.
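The pipeline can be pictured as three steps: sample frames from the video, OCR the subtitle region of each frame, then merge runs of identical readings into timed cues. A minimal sketch of that merge step, assuming per-frame OCR text is already available (the function and parameter names are ours, not from the project):

```python
def merge_frames_to_cues(frame_texts, fps, stride=1):
    """Collapse consecutive identical OCR readings into (start_sec, end_sec, text)
    cues. frame_texts[i] is the OCR result for the i-th sampled frame; empty
    strings mean no subtitle was on screen. stride is the frame-sampling step,
    so sampled index i corresponds to video frame i * stride."""
    cues = []
    current, start = None, 0
    for i, text in enumerate(list(frame_texts) + [""]):  # sentinel flushes the last cue
        if text != current:
            if current:  # close the previous cue; blank spans produce no cue
                cues.append((start * stride / fps, i * stride / fps, current))
            current, start = text, i
    return cues
```

For example, at 1 fps the readings `["", "Hi", "Hi", "", "Bye"]` become two cues: "Hi" from 1.0 s to 3.0 s and "Bye" from 4.0 s to 5.0 s.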
## Challenges we ran into
- Achieving high OCR accuracy with diverse font styles and backgrounds.
- Synchronizing subtitle timing properly.
- Handling large video file sizes within the cloud processing limits.
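The first two challenges interact: a single misread character between frames can split one on-screen subtitle into several short cues with wrong timings. One mitigation we can sketch is fuzzy matching of consecutive readings using Python's standard-library `difflib` (the helper name and threshold are our assumptions):

```python
from difflib import SequenceMatcher

def same_cue(a: str, b: str, threshold: float = 0.9) -> bool:
    """Treat two per-frame OCR readings as the same subtitle if they are
    near-identical, absorbing small frame-to-frame OCR jitter."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

In the merge step, comparing readings with `same_cue` instead of strict equality keeps a cue alive across frames where OCR flickers by a character or two.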
## What we learned
We learned how to integrate multiple technologies — video processing, OCR, and deep learning — into a single streamlined pipeline. We also explored optimization techniques to improve speed and accuracy.
## What's next for SubExtract AI
We plan to:
- Add multilingual subtitle support.
- Improve the user interface for subtitle correction.
- Release the project as an open-source tool for the video editing and accessibility communities.