Player with modified play speeds.
Students in the modern day are often prescribed (or resort to) YouTube video lectures to support their learning. However, a time crunch in the hectic everyday student life can lead to randomly skipping through and hoping to retain anything remotely useful from the video.
While randomly skipping through resolves the time crunch, it can result in sub-optimal learning. We wanted to design a solution that analyzes the video and adjusts the playback speed by timestamp ranges, such that the important material is covered in less time.
This solution can extend to various applications including optimizing a busy student’s time, reviewing the highlights of an interesting topic, or speed-listening to audiobooks.
What it does
We developed an algorithm that adjusts a video’s playback speed by timestamp ranges according to the importance of the material in that time.
To do so, we used a YouTube transcript API to extract captions from the desired YouTube video, whether they were auto-generated by speech-to-text transcription or provided by the video uploader. These captions are then fed into Google Cloud’s Natural Language API, which analyzes them and assigns a salience score to each “entity” it finds. A higher salience score means the entity is more important to the text and therefore more relevant to the viewer.
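As a rough illustration of the step above, the sketch below averages entity salience per caption segment. It makes several assumptions: captions arrive as dictionaries with `text`, `start`, and `duration` keys, entity saliences have already been collected into a name-to-score mapping, and entities are matched to segments by simple substring search. The function name `score_segments` and the sample data are hypothetical, and the project may well match entities differently.

```python
def score_segments(segments, entity_salience):
    """Return the average salience of the entities mentioned in each segment.

    segments: list of dicts with a "text" key (assumed caption shape).
    entity_salience: dict mapping entity name -> salience score (assumed).
    Segments that mention no known entity get a score of 0.0.
    """
    scores = []
    for seg in segments:
        text = seg["text"].lower()
        # Naive substring match; an assumption, not the project's actual method.
        hits = [s for name, s in entity_salience.items() if name.lower() in text]
        scores.append(sum(hits) / len(hits) if hits else 0.0)
    return scores


# Made-up sample data for illustration only.
segments = [
    {"text": "Welcome back everyone", "start": 0.0, "duration": 2.5},
    {"text": "Today we cover gradient descent", "start": 2.5, "duration": 3.0},
    {"text": "um, okay", "start": 5.5, "duration": 1.0},
]
entity_salience = {"gradient descent": 0.6, "everyone": 0.1}

print(score_segments(segments, entity_salience))
```

Segments with no recognized entity score zero here, which would make them candidates for the fastest playback.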
Playback speeds are then assigned to sections of the video according to each section’s average salience score. Low-salience sections are given higher playback speeds, and vice versa. We then used an averaging method to smooth transitions between low- and high-salience sections.
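The write-up does not say which averaging method was used for smoothing. One plausible choice is a centered moving average over the per-section speeds, sketched below. The function name `smooth_speeds` and the default window of 3 are assumptions made for illustration.

```python
def smooth_speeds(speeds, window=3):
    """Smooth a list of per-section speeds with a centered moving average.

    The window shrinks near the start and end of the list so that every
    section still gets a value.
    """
    half = window // 2
    smoothed = []
    for i in range(len(speeds)):
        lo = max(0, i - half)
        hi = min(len(speeds), i + half + 1)
        smoothed.append(sum(speeds[lo:hi]) / (hi - lo))
    return smoothed


# A sudden jump from 1.0x to 2.0x becomes a gradual ramp.
print(smooth_speeds([1.0, 1.0, 2.0, 2.0, 1.0]))
```

A wider window gives gentler transitions at the cost of blurring short high-salience sections into their neighbors.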
How we built it
We started by finding the Natural Language API on Google Cloud and setting up the Python project. We then found a YouTube transcript API on GitHub to obtain transcripts of videos. The next step was to feed the YouTube transcript into the Natural Language API to generate complexity/salience scores for sentences and entities.
We designed various algorithms to generate appropriate speeds from salience averages for sentences and caption sections. Based on the data, we determined a function that scales playback speed appropriately with the salience of entities.
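The exact scaling function is not given here. A minimal sketch of one possibility is a linear map from normalized salience to speed, so that the least salient sections play fastest and the most salient play at normal speed. The name `salience_to_speed` and the 1.0x to 2.0x range are assumptions.

```python
def salience_to_speed(salience, max_salience, min_speed=1.0, max_speed=2.0):
    """Map a salience score to a playback speed.

    Zero salience maps to max_speed; salience equal to max_salience maps
    to min_speed. Values in between are interpolated linearly.
    """
    if max_salience <= 0:
        # No usable salience information: fall back to normal speed.
        return min_speed
    # Normalize to [0, 1] and clamp to guard against out-of-range scores.
    ratio = min(max(salience / max_salience, 0.0), 1.0)
    return max_speed - ratio * (max_speed - min_speed)


print(salience_to_speed(0.0, 0.5))   # least salient section
print(salience_to_speed(0.25, 0.5))  # halfway
print(salience_to_speed(0.5, 0.5))   # most salient section
```

Normalizing by the video's maximum salience keeps the speeds comparable across videos whose raw scores differ in scale.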
Finally, we developed a website for user input. Flask allows the back-end Python programs we created to interact with the website: it parses the submitted YouTube URL, extracts the captions, and runs them through our Python programs. The website also contains a video player that plays each section at its assigned speed for efficient viewing.
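The URL-parsing step might look something like the sketch below, which pulls the video ID out of a few common YouTube URL shapes using only the standard library. The function name `extract_video_id` and the set of accepted hosts and paths are assumptions; the project's own parsing may differ.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as YouTube in this sketch (an assumption, not exhaustive).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        if parsed.path.startswith("/embed/"):
            return parsed.path[len("/embed/"):].split("/")[0] or None
    return None


print(extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=42s"))
```

Returning `None` for unrecognized input lets the web layer show an error message instead of passing a bad ID to the transcript step.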
Challenges we ran into
Finding a proper API for YouTube transcription - Google’s YouTube Data API v3 was reliable only for uploader-provided captions, not for auto-generated captions, so we turned to a transcript API we found on GitHub
Setting up the API and getting it to work with Python code - finding accurate sources and understanding documentation, setting up credentials, etc.
Finding a way to accurately compare the complexity/salience of a sentence relative to the overall captions
Creating a function that maps salience to a playback speed that remains understandable to humans - low salience, high playback speed; high salience, normal playback speed
Setting up a web app - having our Python code communicate with the Flask back-end component
Accomplishments that we’re proud of
Figuring out how to set up permissions for the APIs
Efficient and elegant methods for seeking and adjusting playback speed to adapt to current video time, allowing users to seek specific times and interact with the video player more substantially
Testing, quality assurance, and making the system compliant with good security practices
Coming up with an idea we’re passionate about, an awesome name, and the ~aesthetics~
Writing a super cool and comprehensive Readme and Devpost story
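To illustrate the seeking behavior mentioned in the list above: when a user jumps to an arbitrary time, the player needs the speed of whichever section contains that time. A minimal sketch using binary search over section start times follows. It is written in Python for consistency with the back end, although the real player logic presumably runs in the browser. The name `speed_at` and the sample sections are hypothetical.

```python
from bisect import bisect_right


def speed_at(time_s, starts, speeds, default=1.0):
    """Return the playback speed for the section containing time_s.

    starts: sorted list of section start times in seconds.
    speeds: playback speed for each section, same length as starts.
    Times before the first section fall back to the default speed.
    """
    if not starts or time_s < starts[0]:
        return default
    # bisect_right finds the first start strictly greater than time_s,
    # so the section just before it is the one containing time_s.
    return speeds[bisect_right(starts, time_s) - 1]


# Made-up sections: 0-10s at 1.5x, 10-25s at 1.0x, 25s onward at 2.0x.
starts = [0.0, 10.0, 25.0]
speeds = [1.5, 1.0, 2.0]
print(speed_at(12.0, starts, speeds))
```

Binary search keeps each lookup logarithmic in the number of sections, which matters little for one video but avoids rescanning the list on every time update.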
What we learned
APIs are hard but they’re useful for data analysis
Interactions between front-end and back-end web development
Good design gives a product character
What's next for Compress-Hension
The future of Compress-Hension revolves around program improvements and expanding user options. The program can be improved by increasing code efficiency and implementing more fun algorithms by changing up the mathematics.
We want to host the website in the cloud, give users control over playback speeds, and provide video analytics that display the relationships between playback speed, salience, and time throughout the submitted video. We also want to add support for non-YouTube websites, such as Netflix or Hulu, and develop a way to support multiple users without maxing out simultaneous API calls.