If you're like me, you enjoy learning but find it difficult to sit still and focus for an entire lecture. For students with learning differences such as attention deficit disorder and ADHD, sitting and focusing on one topic for two hours is a great challenge: they get distracted, lose interest, and fall behind in class. We propose automatically creating lecture summaries that condense the information in a two-hour lecture video into a 5 to 10 minute video.
What it does
CondenseMyTalk uses two main signals to determine the "importance" of a part of a video: user feedback, and a summary generated from a transcript of the video. Users can upvote and downvote parts of the video and attach supplementary information to it. CondenseMyTalk records the point in the video at which you upvote or downvote and treats it as a "region of interest": an upvoted part is considered more important, and a downvoted part is considered less important. CondenseMyTalk also uses the Google Cloud Speech-to-Text API to create a full transcript of the video, which is fed into a text summarizer; the parts of the video that appear in that summary are also considered important. Finally, parts of the video are added to the summary in order of importance to create a lecture summary video.
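The selection step can be illustrated with a minimal Python sketch. This is not the project's actual code: the `Segment` class, the `importance` weights (`vote_weight`, `summary_bonus`), and the time budget are all assumptions made for illustration. It scores each segment from its votes and from whether it appears in the text summary, adds segments in order of importance until the target length is reached, and then (also an assumption) puts the chosen segments back in chronological order so the result still plays like a lecture.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A hypothetical slice of the lecture, in seconds."""
    start: float
    end: float
    upvotes: int = 0
    downvotes: int = 0
    in_summary: bool = False  # True if the text summary kept this part

    @property
    def duration(self) -> float:
        return self.end - self.start


def importance(seg: Segment, vote_weight: float = 1.0,
               summary_bonus: float = 2.0) -> float:
    """Combine both signals into one score (weights are assumed values)."""
    score = vote_weight * (seg.upvotes - seg.downvotes)
    if seg.in_summary:
        score += summary_bonus
    return score


def select_segments(segments: list[Segment],
                    budget_seconds: float) -> list[Segment]:
    """Greedily add segments in order of importance until the budget is full."""
    ranked = sorted(segments, key=importance, reverse=True)
    chosen: list[Segment] = []
    used = 0.0
    for seg in ranked:
        if used + seg.duration <= budget_seconds:
            chosen.append(seg)
            used += seg.duration
    # Assumption: play the chosen parts in their original order.
    return sorted(chosen, key=lambda s: s.start)


lecture = [
    Segment(0, 60, upvotes=1),
    Segment(60, 120, downvotes=2),
    Segment(120, 180, in_summary=True),
    Segment(180, 240, upvotes=3, in_summary=True),
]
summary = select_segments(lecture, budget_seconds=120)
print([(s.start, s.end) for s in summary])
```

With a two-minute budget, the two highest-scoring one-minute segments are kept and the downvoted segment is dropped.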
Accomplishments that I'm proud of
I had never made extensive use of the Google Cloud APIs in a project before. I am impressed with how smoothly they integrated with what I was trying to do, including uploading to Google Cloud Storage and transcribing large audio files.
What I learned
I learned a lot about text summarization techniques, stitching videos together at runtime in Python, and using the Google Cloud Speech-to-Text API.
What's next for CondenseMyTalk
For future work, I would like to incorporate optical character recognition to analyze the importance of individual frames and factor that into the importance of each part of the video. Additionally, the two parts of the program that take the most compute time are creating the transcript, which is a necessary evil, and compiling the condensed video. One possible alternative is, instead of creating an entirely new video, to scrub through the existing video during playback, skipping over the less important parts.
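One way the scrubbing idea might work is sketched below in Python. This is only an illustration of the proposal, not something CondenseMyTalk does today; `merge_intervals` and `next_playback_position` are hypothetical helper names. Given the important intervals (in seconds), the player asks where it should be for the current timestamp and seeks forward whenever it lands in an unimportant gap, so no new video file has to be rendered.

```python
def merge_intervals(intervals):
    """Sort (start, end) pairs and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def next_playback_position(t, kept):
    """Return where the player should be at time t.

    `kept` must be sorted and non-overlapping (see merge_intervals).
    Returns t itself if t is inside a kept interval, the start of the
    next kept interval if t is in a gap, or None if nothing is left.
    """
    for start, end in kept:
        if t < start:
            return start  # in a gap: jump ahead to the next important part
        if t < end:
            return t      # already inside an important part: keep playing
    return None           # past the last important part: stop playback


kept = merge_intervals([(120, 180), (10, 30), (25, 40)])
print(kept)
print(next_playback_position(0, kept))
print(next_playback_position(40, kept))
```

A player would call `next_playback_position` on each time update and seek only when the returned value differs from the current time.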