Inspiration
For the last couple of weeks, our entire group has spent hours and hours scrubbing through lecture slides and YouTube videos trying to prep for midterms. A lot of the time, these videos are 20-30 minutes long even though much of the information is useless or repeated several times. We decided that we needed to optimize the studying process, and created a Hack that does so.
What it does
Athena parses mp4, mp3 and wav files and creates a text transcript of the video. Using that transcript, Athena runs a modified version of the SMMRY algorithm (used to power TLDRs on websites like Reddit) that also accounts for correlation to a given topic, to find the n most important sentences in that video.
How we built it
We built this Hack using a number of services. All of the work under the hood in the transcription uses Google Cloud technology. In brief, our Summary Algorithm assigns a point value to each word (based on how often it appears and how correlated it is with the main subject of the video) and then assigns a point value to each sentence (the sum of the point values of the words in the sentence). The NLP that determines how correlated two words are is powered by NLTK, an NLP library for Python. Our UI is built using Tkinter, a basic GUI library for Python.
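The scoring idea above can be sketched in a few lines of Python. This is a minimal illustration, not our actual code: the function names, the `topic_weight` parameter, and the way frequency and correlation are combined (frequency plus a weighted similarity) are all assumptions, and `topic_similarity` is a hypothetical stand-in for the NLTK-based correlation.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep only alphabetic words."""
    return re.findall(r"[a-z']+", sentence.lower())


def topic_similarity(word, topic):
    """Hypothetical placeholder for the NLP correlation: 1.0 on exact match, else 0.0."""
    return 1.0 if word == topic.lower() else 0.0


def summarize(text, topic, n, topic_weight=2.0):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    counts = Counter(w for s in sentences for w in tokenize(s))
    # Word score: how often the word appears plus a bonus for topic correlation.
    word_score = {
        w: c + topic_weight * topic_similarity(w, topic) for w, c in counts.items()
    }
    # Sentence score: sum of the scores of the words it contains.
    scored = [
        (sum(word_score[w] for w in tokenize(s)), i, s)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:n]
    return [s for _, _, s in sorted(top, key=lambda t: t[1])]


transcript = (
    "Cells divide by mitosis. The weather is nice. "
    "Mitosis has four phases in cells."
)
print(summarize(transcript, topic="mitosis", n=2))
```

With this toy transcript, the off-topic sentence about the weather is the one that gets dropped. Note that a plain sum favors long sentences, so a real implementation would probably want some length normalization.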
Challenges we ran into
1) Optimization: Our algorithm created a correlation matrix across all of the words, which means that it took a very long time to run. We had to think through a lot of preprocessing steps (getting rid of useless words, merging words that all mean the same thing) to optimize the algorithm as much as possible
2) Front-End: None of us had ever worked with a GUI before, so it took us some time to understand what to do
3) Configuring Google Cloud: In order to increase efficiency, our project runs all of its transcription on Google Cloud's servers instead of locally. This was a little tedious, and it took time to get it to work
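The preprocessing from challenge 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: `STOP_WORDS` is a tiny hand-written list and `SYNONYMS` is a hypothetical hand-written map, whereas a real version would more likely draw on a library resource such as NLTK's stop word corpus.

```python
# Tiny illustrative stop word list (assumption; a real list would be much longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}

# Hypothetical map from a variant to the canonical word it should be merged into.
SYNONYMS = {"film": "video", "clip": "video"}


def preprocess(words):
    """Drop stop words and collapse synonyms onto one canonical word."""
    cleaned = []
    for word in words:
        word = word.lower()
        if word in STOP_WORDS:
            continue
        cleaned.append(SYNONYMS.get(word, word))
    return cleaned


def vocabulary(words):
    """Distinct words left after preprocessing; this is what the matrix is built over."""
    return sorted(set(preprocess(words)))


raw = ["The", "film", "is", "a", "clip", "of", "the", "lecture"]
vocab = vocabulary(raw)
print(vocab, len(raw) ** 2, len(vocab) ** 2)
```

Since the correlation matrix grows with the square of the number of words, shrinking the vocabulary before building it is where most of the savings come from.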
Accomplishments that we're proud of
1) Our TLDR works the way we intended it to! It works almost identically to actual TLDRs on Reddit, and our feature of incorporating correlation to a topic works well
2) We figured out how to configure the transcription to work with mp3 links, meaning that anyone can use this for any YouTube video. They just need to use YouTube to MP3 and they're set!
What we learned
1) We learned a lot about information theory and how to assign importance to words algorithmically
2) We learned how NLP is used to detect similarity in strings
3) We learned how to use a GUI
What's next for Athena
Although we did optimize our algorithm a lot, it is still technically O(n^2), with n being the number of words. Our next step is to think of a one-pass solution so that Athena runs faster and is easier for our users to use. We also want to improve our front end so that Athena is a website.
Built With
- google-cloud
- nltk
- pygame
- python
- tkinter