What does it do?
Our project automatically parses the captions from YouTube videos and outputs them as one plain string, which is then tokenized. Once the words are tokenized, we take a user's search query and look for it in the captions via those tokens. We then recommend the video whose tokens match the most tokens from the user's search.
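As a rough illustration of the matching step, here is a minimal sketch rather than our actual code. The names `tokenize`, `match_score`, and `recommend` are hypothetical, the regex tokenizer is an assumption, and "most matching tokens" is interpreted here as the number of times the query's distinct tokens occur in a video's captions. The captions are assumed to be already downloaded as plain strings.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def match_score(query_tokens, caption_tokens):
    """Count how often each distinct query token occurs in the captions."""
    caption_counts = Counter(caption_tokens)
    return sum(caption_counts[token] for token in set(query_tokens))


def recommend(query, captions_by_video):
    """Return the id of the video whose captions best match the query.

    captions_by_video maps a video id to its caption text (hypothetical shape).
    On a tie, the first video in the mapping wins.
    """
    query_tokens = tokenize(query)
    return max(
        captions_by_video,
        key=lambda vid: match_score(query_tokens, tokenize(captions_by_video[vid])),
    )


# Made-up caption strings, for illustration only.
captions = {
    "vid_a": "Today we bake bread. Bread needs flour and yeast.",
    "vid_b": "How to change a bike tire",
}
best = recommend("bread flour", captions)
print(best)
```

With these sample captions, "bread" appears twice and "flour" once in `vid_a`, while neither appears in `vid_b`, so `vid_a` is the one recommended.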
How did we put it together?
We built it using Python and the YouTube API.
What challenges did we run into?
We ran into a steady stream of challenges. For most of Saturday, we struggled with the YouTube API and OAuth 2.0. We eventually learned that, because the caption download sits in the same part of the API as the built-in delete and upload functions, we could only download captions for videos we had edit permissions on. This limited us greatly.
What we learned
We learned a great deal from this experience: cosine similarity, a number of libraries we can import into Python files, and more use cases for regular expressions. These last 36 hours have been a flood of knowledge.
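For readers who have not met cosine similarity, here is a minimal sketch of how it can be computed over token counts. This is an illustration under our own assumptions, not necessarily how the project computes it; the function name `cosine_similarity` and the bag-of-words representation are chosen for the example.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)

    # Dot product over the tokens the two sides share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())

    # Euclidean length of each count vector.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty token list has no direction, so treat it as no similarity.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity(["bake", "bread"], ["bake", "bread"]))
print(cosine_similarity(["bake", "bread"], ["bike", "tire"]))
```

Identical token lists score close to 1 and lists with no tokens in common score 0, which makes the measure independent of how long each caption track is.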
What's next for YouTube Query Enhancer
Another use case we think could work is video recommendation: comparing the content of the current video against other videos and showing the most closely related ones, drawn either from a preset list or from the website as a whole. This would require more calculations and a lot more time.