UWBHacks2022
Adam Deehring, Chloe Tang, Jaylon Nelson-Sellers, Yash Varde
Goals of the project: Improving Education using the Cloud
The idea of our project is to improve students' retention of the ideas and concepts presented in class through repetition and additional resources based on the contents of the lecture.
Desired user experience
Our program has two sets of users. The first are professors, who only need to feed the program the subtitles for their lecture and give weights to the specific keywords mentioned during that lecture. The program then searches for additional resources based on those keywords, with a focus on the words the professor deemed most important. The second are students, who use the program's output, a text file containing the keywords and the additional resources found for them, to follow up on specific terms used during the lecture.
Implementation details
We wrote the program in Java, as it was the language we all had the most experience in. We chose .vtt as the input format because it is the format Zoom produces for the transcript of a recording, and since our goal was to improve retention, Zoom lectures were a natural target. The program takes in a .vtt file and removes commonly used words and filler words, so that the words left are less common but more important to the lecture. It then performs a statistical analysis of the remaining transcript and reports the keywords determined by our algorithm. We run these keywords through an API, SerpApi, to retrieve additional information on them, so that a student who needs more resources on a given term can easily follow up.
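The transcript-to-keywords step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class and method names are hypothetical, "statistical analysis" is assumed here to mean a plain frequency count, and the sketch does not strip speaker labels or call SerpApi.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical names throughout; a sketch of the idea, not the project's code.
final class VttKeywordSketch {

    // Keeps only spoken text: drops the WEBVTT header, blank lines,
    // purely numeric cue identifiers, and "start --> end" timing lines.
    static List<String> spokenLines(String vtt) {
        List<String> out = new ArrayList<>();
        for (String raw : vtt.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("WEBVTT")
                    || line.contains("-->") || line.matches("\\d+")) {
                continue;
            }
            out.add(line);
        }
        return out;
    }

    // Counts every lowercase word of two or more characters
    // that is not in the stop-word set.
    static Map<String, Integer> countKeywords(List<String> lines, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
                if (token.length() < 2 || stopWords.contains(token)) {
                    continue;
                }
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Returns up to n words, most frequent first; ties are broken alphabetically.
    static List<String> topKeywords(Map<String, Integer> counts, int n) {
        List<String> words = new ArrayList<>(counts.keySet());
        words.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(words.subList(0, Math.min(n, words.size())));
    }

    public static void main(String[] args) {
        String vtt = "WEBVTT\n\n1\n00:00:01.000 --> 00:00:04.000\n"
                + "The eigenvalue of the matrix\n";
        Set<String> stopWords = Set.of("the", "of");
        System.out.println(topKeywords(countKeywords(spokenLines(vtt), stopWords), 5));
    }
}
```

The words returned by the hypothetical topKeywords method would then be the search terms handed to SerpApi. A professor's weights could be applied by multiplying each count by its weight before sorting.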
ISSUES AND BUGS
One of our initial issues was importing the appropriate APIs. We addressed this by trying different APIs and working out which ones simplified the task the most. Another major issue was identifying common words that can be skipped. Even though we found a text file containing the 1000 most common English words, testing it on the Avengers: Endgame transcript still surfaced undesired common words that were not among the 1000 in the file. To resolve this, we added many of these words from the transcript to the text file and took out the entries that are not actually very common (like eigenvalue, spectral, quantum, etc.).
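The adjustment to the common-word list could also be made in code rather than by editing the text file by hand. The sketch below assumes three word lists are already in memory; the class name, method name, and parameter names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; a sketch of the list adjustment, not the project's code.
final class StopWordListSketch {

    // Starts from the base common-word list, adds extra filler words seen in
    // test transcripts, then removes domain terms that must stay searchable.
    static Set<String> build(List<String> baseWords,
                             List<String> extraCommon,
                             List<String> domainTerms) {
        Set<String> stop = new HashSet<>();
        for (String w : baseWords) {
            addNormalized(stop, w);
        }
        for (String w : extraCommon) {
            addNormalized(stop, w);
        }
        for (String w : domainTerms) {
            stop.remove(normalize(w));
        }
        return stop;
    }

    // Trims whitespace and lowercases so lookups are case-insensitive.
    private static String normalize(String word) {
        return word.trim().toLowerCase(Locale.ROOT);
    }

    // Adds the normalized word, skipping blank entries.
    private static void addNormalized(Set<String> target, String word) {
        String w = normalize(word);
        if (!w.isEmpty()) {
            target.add(w);
        }
    }
}
```

The base list could be read with Files.readAllLines on the common-word file. Keeping the additions and removals as separate lists leaves the downloaded file untouched and makes each manual change easy to review.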
FUTURE WORK
Create and develop a front end so that the user can interact with a webpage. Add a keyword search similar to Ctrl + F on Windows that highlights all specified keywords/keyphrases rather than just one. This can help the user narrow down locations within the transcript that contain several keywords/keyphrases and thus learn how they are related to one another. Extend our program to be compatible with other file types in addition to .vtt (e.g. SRT).
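One possible starting point for the multi-keyword search is a single regular expression that matches any of the keywords at once. This is only a sketch of the idea: the class name is hypothetical, and the [[ ]] markers stand in for whatever highlighting a future front end would apply.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper; a sketch for the planned feature, not existing code.
final class KeywordHighlighterSketch {

    // Wraps every whole-word, case-insensitive occurrence of any keyword or
    // keyphrase in [[ ]] markers and returns the marked-up text.
    static String highlight(String text, List<String> keywords) {
        if (keywords.isEmpty()) {
            return text;
        }
        // Try longer phrases first so "spectral theorem" wins over "spectral".
        List<String> ordered = new ArrayList<>(keywords);
        ordered.sort(Comparator.comparingInt(String::length).reversed());
        String alternation = ordered.stream()
                .map(Pattern::quote) // treat each keyword as literal text
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile(
                "\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
        // $0 refers to the whole match, so the original casing is preserved.
        return pattern.matcher(text).replaceAll("[[$0]]");
    }
}
```

Because all keywords are matched in one pass, passages where several markers appear close together are the locations where the terms are likely discussed in relation to one another.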