Inspiration
Something we often notice at the beginning of the semester is that Disability Services needs note takers to help students who may not be able to take notes in class. We realized that we don't always take notes in class either. So we decided to harness machine learning to generate notes from lecture videos for those who need them.
What it does
The app has two pages: Transcribe and Summarize.
On the Transcribe page, the user can provide a YouTube link or upload a local file. If the YouTube video has captions, we use those captions as the transcript. If a video file was uploaded, we use ffmpeg to extract the raw audio from the video and Google Cloud Speech-to-Text to transcribe that audio. The resulting transcript is placed in the input box of the Summarize page.
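As a rough illustration of the audio-extraction step, here is a minimal sketch of how an ffmpeg argument list and matching Speech-to-Text settings could be built. The helper names (`buildFfmpegArgs`, `buildRecognitionSettings`) and the chosen values (mono, 16 kHz, 16-bit PCM, `en-US`) are assumptions for illustration, not necessarily what the project uses. The ffmpeg flags are standard options, and the settings fields follow the names in Google Cloud's `RecognitionConfig`.

```typescript
// Hypothetical helper: builds the ffmpeg arguments for stripping the audio
// track out of an uploaded video. The specific values are assumptions.
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                // overwrite the output file if it already exists
    "-i", inputPath,     // input video file
    "-vn",               // drop the video stream, keep audio only
    "-ac", "1",          // downmix to a single (mono) channel
    "-ar", "16000",      // resample to 16 kHz
    "-c:a", "pcm_s16le", // encode as 16-bit little-endian PCM
    outputPath,          // output path, e.g. a .wav file
  ];
}

// Recognition settings that match the audio produced above.
// 16-bit PCM corresponds to the LINEAR16 encoding in Speech-to-Text.
interface RecognitionSettings {
  encoding: "LINEAR16";
  sampleRateHertz: number;
  languageCode: string;
}

// Hypothetical helper: the language code default is an assumption.
function buildRecognitionSettings(languageCode: string = "en-US"): RecognitionSettings {
  return { encoding: "LINEAR16", sampleRateHertz: 16000, languageCode };
}

// Example: the full command line for one upload.
const exampleCommand: string =
  ["ffmpeg", ...buildFfmpegArgs("lecture.mp4", "lecture.wav")].join(" ");
```

In an Express handler, the argument list could be passed to Node's `child_process.spawn("ffmpeg", args)`, and the settings object used as the `config` part of a Speech-to-Text recognize request.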
On the Summarize page, the user can summarize either the resulting transcript or their own text using OpenAI's GPT-3 Curie model. The model is instructed to condense the transcript into convenient-to-read notes.
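To make the summarization step concrete, here is a minimal sketch of how a request body for a GPT-3 completion could be assembled. The helper name `buildSummaryRequest`, the prompt wording, the model identifier `text-curie-001`, and the parameter values are all assumptions for illustration. The project's actual prompt and settings are not documented here. The field names follow OpenAI's legacy Completions API.

```typescript
// Shape of the request body; field names follow the legacy Completions API.
interface CompletionRequest {
  model: string;
  prompt: string;
  max_tokens: number;
  temperature: number;
}

// Hypothetical helper: wraps a transcript in a note-taking instruction.
function buildSummaryRequest(transcript: string, maxTokens: number = 256): CompletionRequest {
  // Collapse runs of whitespace so caption line breaks don't waste tokens.
  const cleaned = transcript.replace(/\s+/g, " ").trim();
  if (cleaned.length === 0) {
    throw new Error("Transcript is empty");
  }
  // Assumed prompt wording; ending with "-" nudges the model toward a list.
  const prompt =
    "Summarize the following lecture transcript into concise notes.\n\n" +
    `Transcript:\n${cleaned}\n\nNotes:\n-`;
  return {
    model: "text-curie-001", // assumed Curie model identifier
    prompt,
    max_tokens: maxTokens,   // upper bound on the length of the notes
    temperature: 0.3,        // assumed: low value for more focused output
  };
}
```

The returned object would be sent as the JSON body of a POST to the completions endpoint, with the API key supplied by the server so it never reaches the browser.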
How we built it
We used a React and Express stack for the web app. Google Cloud's Speech-to-Text API and OpenAI's GPT-3 Curie model API handle the machine learning tasks of transcription and summarization, respectively. ffmpeg strips the audio from the video before it is uploaded to Google Cloud.
Challenges we ran into
We had no experience using third-party services such as Google Cloud, so getting authentication to work was a learning experience. We are very rusty at software development in general, so this was a great exercise for us. We also learned quite a bit about audio encoding while figuring out how to strip the audio from the video.
Accomplishments that we're proud of
We're proud that we were able to put it all together! The writer also really likes his Paint-drawn logo :D
What we learned
Integrating APIs sounds like a trivial task at first glance, but it can actually be very challenging. Also, the capabilities of today's ML projects are really amazing!
What's next for Lecture Summarizer
We want to measure the accuracy of the summaries. If a provided YouTube link has no captions, the app should download the video and run speech-to-text on it. We also want to give users better feedback: many errors currently go unreported, and there is no progress bar.
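The planned caption fallback could be expressed as a small decision function. This is only a sketch of the intended behavior. The type and function names and the step labels are hypothetical, and the download step is not implemented here.

```typescript
// Either use existing captions, or fall back to the speech-to-text pipeline.
type TranscriptPlan =
  | { kind: "captions" }
  | { kind: "speech-to-text"; steps: string[] };

// Hypothetical helper: decides how to obtain a transcript for a YouTube link.
function planYouTubeTranscript(hasCaptions: boolean): TranscriptPlan {
  if (hasCaptions) {
    // Captions already exist, so no audio processing is needed.
    return { kind: "captions" };
  }
  // No captions: reuse the same path as an uploaded file after downloading.
  return {
    kind: "speech-to-text",
    steps: ["download-video", "extract-audio", "transcribe-audio"],
  };
}
```

Routing uncaptioned videos through the same extract-and-transcribe path as uploaded files would keep the fallback from needing a separate pipeline.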
