Inspiration

Being students ourselves, we sometimes miss out on vital details during lectures and don't feel confident in what was just taught. This product provides a way to quickly summarize our lectures and test our knowledge with practice problems to boost confidence.

What it does

LectureGPT is a Python application that lets students upload lectures, either as YouTube links or as mp3 files, and get a summary of each one. It also generates practice problems on the concepts they want to review.

How we built it

Our Python program is broken down into four main tasks:

  1. Getting an mp3 file from the specified YouTube link: We used the Pytube library to download the mp3 file.

  2. Transcribing that mp3 file into text: We used Whisper through the OpenAI API to transcribe the mp3 into a text file.

  3. Feeding that text into GPT to get a lecture summary and notes/practice problems: We used ChatGPT via the OpenAI API to summarize the transcription and to generate practice problems.

  4. Wrapping the functionality up in an aesthetic GUI: We used the CustomTkinter library to create the GUI.

Challenges we ran into

  1. Supporting Panopto Recordings: Originally we wanted to transcribe UMD Panopto recordings, but we had trouble scraping the m3u8 request codes programmatically and then converting the m3u8s into mp3s. We decided not to include Panopto support in this version.

  2. Whisper's File Size Limit: Whisper has a file size limit of 25 MB, so we had to use FFmpeg to cut each recording into smaller chunks, transcribe each chunk separately, and then combine the results.

  3. ChatGPT Prompt's Character Limit: The ChatGPT API has a limit of 4,096 characters, while some of our longer lectures could reach 50,000+ characters. Furthermore, ChatGPT doesn't remember earlier blocks of text between requests, so we had to summarize each block on its own and then get ChatGPT to summarize those summaries.

  4. OpenAI's Credit Limit: For free accounts, OpenAI has a credit limit of $5, which we each quickly burned through. By the end, we had to start paying to keep using their APIs.
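
The workarounds in challenges 2 and 3 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (segment_seconds, ffmpeg_split_command, split_text, summarize_long) are hypothetical, the summarize argument is a stand-in for whatever call sends one block to the ChatGPT API, the audio is assumed to have a roughly constant bitrate, and the 4,096 limit is treated as a character count as described above. A real prompt would also need to leave room for its own instructions inside that limit.

```python
import math
from typing import Callable, List


def segment_seconds(file_bytes: int, duration_s: float,
                    limit_bytes: int = 25 * 1024 * 1024,
                    safety: float = 0.9) -> int:
    """Pick a chunk length in seconds so each audio chunk stays under the size limit.

    Assumption: bitrate is roughly constant, so size grows linearly with duration.
    """
    if file_bytes <= limit_bytes:
        return math.ceil(duration_s)  # small enough already, no split needed
    bytes_per_second = file_bytes / duration_s
    return max(1, int(limit_bytes * safety / bytes_per_second))


def ffmpeg_split_command(src: str, seconds: int,
                         pattern: str = "chunk_%03d.mp3") -> List[str]:
    """Build an FFmpeg command that cuts src into fixed-length pieces without re-encoding."""
    return ["ffmpeg", "-i", src, "-f", "segment",
            "-segment_time", str(seconds), "-c", "copy", pattern]


def split_text(text: str, limit: int) -> List[str]:
    """Greedily pack whitespace-separated words into blocks of at most limit characters."""
    blocks: List[str] = []
    current = ""
    for word in text.split():
        while len(word) > limit:  # hard-cut a single word that exceeds the limit
            if current:
                blocks.append(current)
                current = ""
            blocks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            blocks.append(current)
            current = word
    if current:
        blocks.append(current)
    return blocks


def summarize_long(text: str, summarize: Callable[[str], str],
                   limit: int = 4096) -> str:
    """Summarize each block, then summarize the joined summaries until one request fits."""
    blocks = split_text(text, limit)
    if len(blocks) <= 1:
        return summarize(text)
    combined = "\n".join(summarize(block) for block in blocks)
    if len(combined) >= len(text):
        # Guard against looping forever if the summaries are not getting shorter.
        raise ValueError("summaries are not shrinking the text")
    return summarize_long(combined, summarize, limit)
```

The command list from ffmpeg_split_command could be handed to subprocess.run, and each resulting chunk transcribed separately before the transcripts are joined and passed to summarize_long. Passing the summarizer in as a function keeps the chunking logic testable without spending API credits.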

Accomplishments that we're proud of

  1. Getting around the file size and character limits to summarize videos more than an hour long.
  2. Properly using version control with Git/GitHub.
  3. Making a clean GUI and webpage for the product.

What we learned

We learned a lot about video streaming, including the purpose of m3u8 files and how they differ from regular mp3 or mp4 files. We used many new libraries and tools for the first time, including OpenAI, Tkinter, Selenium, Pytube, and FFmpeg. Most importantly, we learned how to divide up a programming project among our members and maintain our source code with Git.

What's next for LectureGPT

  1. Supporting Panopto recordings
  2. Converting the app into a web app
  3. Upgrading to GPT-4
  4. Adding support for more systems
  5. Using our own ML algorithms to fine-tune the transcriptions
