Text-based search in a document is only a CTRL+F away in modern web browsers. We wondered: why don't videos on the web have the same feature? Whether it's finding a certain scene in a movie or the start of a specific discussion in a historical documentary, we wanted to reduce the time it takes to find the relevant parts of a video.

YouTubeFind is CTRL+F for videos. It's a web application that lets anyone search for a scene or line of dialogue within a YouTube video. On our website, anyone can search inside any YouTube video that includes closed captions.

What it does

YouTubeFind captures the transcript of a video and lets anyone search it for a scene, dialogue, or text. In addition to search, YouTubeFind generates analytics for content creators ("YouTubers") who want to learn more about their audience. Key metrics, such as subscribers' search rates and habits and how often YouTubeFind is used on a video, are of the utmost importance to serious content creators.
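
To make the "how often YouTubeFind is used on a video" metric concrete, here is a minimal in-memory sketch of that kind of aggregation. The event shape (`videoId`, `term`) and the function names `summarizeSearches` and `topTerms` are assumptions made for illustration; they are not taken from our Firebase backend.

```javascript
// Hypothetical event shape: { videoId: string, term: string }.
// Counts searches per video and how often each (normalized) term was used.
function summarizeSearches(events) {
  const perVideo = new Map();
  for (const { videoId, term } of events) {
    if (!perVideo.has(videoId)) {
      perVideo.set(videoId, { searches: 0, terms: new Map() });
    }
    const stats = perVideo.get(videoId);
    stats.searches += 1;
    // Normalize so "Cat" and "cat " count as the same search term.
    const key = term.trim().toLowerCase();
    stats.terms.set(key, (stats.terms.get(key) || 0) + 1);
  }
  return perVideo;
}

// Returns the n most frequent terms for one video as [term, count] pairs.
function topTerms(stats, n) {
  return [...stats.terms.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}
```

A creator-facing dashboard could then call `topTerms(summary.get(videoId), 5)` to list the five most searched terms for one video.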

How we built it

YouTubeFind was built using vanilla JavaScript along with a variety of other tools, such as Firebase, Postman, Materialize, and jQuery. The application consists of a frontend and a backend. The backend is a Firebase instance that runs analytics over the data entered on the frontend (video links and search terms). The public-facing frontend accepts YouTube links and executes our search technique.
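
Since the frontend starts from a pasted YouTube link, the first step is getting a video ID out of it. The following is a rough sketch of one way to do that with the standard `URL` class. The function name `extractVideoId`, the set of URL shapes it handles, and the 11-character ID check are assumptions for illustration, not a description of our exact code.

```javascript
// Extracts the video ID from a few common YouTube link shapes.
// Returns null when the link is not recognized.
function extractVideoId(link) {
  let url;
  try {
    url = new URL(link);
  } catch (err) {
    return null; // not a parseable absolute URL
  }
  // Treat www.youtube.com and m.youtube.com like youtube.com.
  const host = url.hostname.replace(/^(www|m)\./, '');
  let id = null;
  if (host === 'youtu.be') {
    // Short links: https://youtu.be/<id>
    id = url.pathname.slice(1).split('/')[0];
  } else if (host === 'youtube.com') {
    if (url.pathname === '/watch') {
      // Standard links: https://www.youtube.com/watch?v=<id>
      id = url.searchParams.get('v');
    } else {
      // Embed and Shorts links: /embed/<id> or /shorts/<id>
      const match = url.pathname.match(/^\/(?:embed|shorts)\/([^/]+)/);
      if (match) id = match[1];
    }
  }
  // Assumption: IDs are 11 characters from [A-Za-z0-9_-].
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

For example, `extractVideoId('https://youtu.be/dQw4w9WgXcQ?t=42')` yields `'dQw4w9WgXcQ'`, while a link to any other site yields `null`.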

Challenges we ran into

The YouTube Data API (v3) was difficult to work with: common tasks such as downloading captions or running basic data queries became tricky very quickly. The workarounds we created to get data from the API were unfortunately unsuccessful, but they taught us an immense amount about security in web APIs. Our attempts included tricky and arguably complex reverse-engineering techniques, and we learned more from them than we could have asked for.

The other core challenge was creating a fast-lookup data structure that maps the words and phrases in a video to their timestamps (i.e., when they occurred). This was possible thanks to the very helpful XML document returned by GET requests for working YouTube videos that provide captions.
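
The idea behind such a lookup structure can be sketched as an inverted index from words to caption cues. This is an illustrative sketch rather than our exact implementation: the caption XML shape (`<text start="..." dur="...">` elements), the regex-based parsing, and the names `parseCaptions`, `buildIndex`, and `search` are all assumptions, and HTML entity decoding is left out for brevity.

```javascript
// Assumed caption format (hypothetical sample):
// <transcript><text start="0.5" dur="2">Hello world</text>...</transcript>

// Pulls { start, text } cues out of the caption XML string.
function parseCaptions(xml) {
  const cues = [];
  const cueRe = /<text\b[^>]*\bstart="([\d.]+)"[^>]*>([\s\S]*?)<\/text>/g;
  let match;
  while ((match = cueRe.exec(xml)) !== null) {
    cues.push({ start: parseFloat(match[1]), text: match[2] });
  }
  return cues;
}

// Lowercases text and splits it into word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Builds a Map from each word to the indices of the cues containing it.
function buildIndex(cues) {
  const index = new Map();
  cues.forEach((cue, i) => {
    for (const word of new Set(tokenize(cue.text))) {
      if (!index.has(word)) index.set(word, []);
      index.get(word).push(i);
    }
  });
  return index;
}

// Returns the start times (in seconds) of cues that contain the phrase.
function search(cues, index, phrase) {
  const words = tokenize(phrase);
  if (words.length === 0) return [];
  // Narrow down to cues that contain every word of the phrase.
  let candidates = index.get(words[0]) || [];
  for (const word of words.slice(1)) {
    const withWord = new Set(index.get(word) || []);
    candidates = candidates.filter((i) => withWord.has(i));
  }
  // Then confirm the words appear consecutively, on word boundaries.
  const needle = ' ' + words.join(' ') + ' ';
  return candidates
    .filter((i) => (' ' + tokenize(cues[i].text).join(' ') + ' ').includes(needle))
    .map((i) => cues[i].start);
}
```

Single-word lookups only touch the index, so they stay fast even for long videos; the returned start times can be handed to an embedded player to seek to each match. A phrase that spans two caption cues is not found by this sketch.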

Accomplishments that we're proud of

Learning how to rigorously test APIs, building efficient data structures, and using JavaScript for embedded video.

What we learned

Web API security, using JavaScript for the web, building a database with Firebase, and handling extreme edge cases.

What's next for YouTubeFind

Reach out to a YouTube engineer to learn how to successfully simulate a session of watching a YouTube video (and receive a proper XML response).
