Inspiration

There is no shortage of online learning resources these days. However, research shows that the best way to learn is through highly personalized, one-on-one education.

With the advent of large language models, people at every income level should be able to have their own tutor. That's why we came up with StudyBuddy!

What it does

Upload a college lecture or educational video and StudyBuddy will watch the whole thing in a few minutes and create a custom study plan for you based on that specific video. This includes generating questions for you to answer in order to check your understanding.

StudyBuddy also has a deep knowledge base of concepts that it can draw from. If you're struggling with a specific area, it can link you to highly rated YouTube videos that explain that exact concept. Each link points to a precise timestamp, so you don't have to waste time scrubbing through the video to find what you want.
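To make the timestamp linking concrete, here is a minimal sketch of how such a link could be built. The function name `timestamped_url` and the video ID in the usage note are placeholders for illustration, not StudyBuddy's actual code; the sketch assumes the standard YouTube `watch` URL with a `t` parameter given in seconds.

```python
from urllib.parse import urlencode


def timestamped_url(video_id: str, start_seconds: int) -> str:
    """Build a YouTube watch URL that starts playback at start_seconds.

    Hypothetical helper for illustration; not taken from the project.
    """
    if start_seconds < 0:
        raise ValueError("start_seconds must be non-negative")
    # urlencode escapes the values and keeps the dict's insertion order.
    query = urlencode({"v": video_id, "t": f"{start_seconds}s"})
    return f"https://www.youtube.com/watch?{query}"
```

Calling `timestamped_url("VIDEO_ID", 95)` would give a link that opens the video 1 minute 35 seconds in.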

You can also talk with StudyBuddy! It has been trained to quiz you on concepts and evaluate your answers. If you get an answer wrong or leave anything out, StudyBuddy will drill down on the specific areas you missed without just giving the answer away. StudyBuddy wants you to learn and grow.

How we built it

We built StudyBuddy entirely in Python, with Streamlit as our frontend, FastAPI as our backend, and Pinecone as our database.

We created a pipeline to download dozens of YouTube videos from respected educational channels and extract their transcripts using the pytube library and OpenAI's Whisper API. Then we passed the transcripts to GPT-4 to distill the key concepts and their timestamps. An hour-long video could yield more than 30 key concepts. These key concepts were then turned into embeddings and stored in a Pinecone database. When we create a study guide, the model queries this Pinecone database to retrieve relevant key concepts and their timestamped URLs.
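The retrieval step can be illustrated with a small, self-contained sketch. This is not the project's code: it swaps Pinecone for an in-memory list and uses hand-written toy vectors in place of real embeddings, so that the ranking idea (nearest key concepts by cosine similarity) is visible without any external service. The names `KeyConcept`, `cosine_similarity`, and `top_concepts` are made up for this example.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class KeyConcept:
    """One distilled concept plus where it appears (hypothetical schema)."""
    text: str
    video_url: str
    start_seconds: int
    embedding: list[float]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_concepts(
    query_embedding: list[float], concepts: list[KeyConcept], k: int = 3
) -> list[KeyConcept]:
    """Return the k stored concepts most similar to the query embedding.

    A vector database does this lookup at scale; the linear scan here
    is only a stand-in for illustration.
    """
    ranked = sorted(
        concepts,
        key=lambda c: cosine_similarity(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:k]
```

In a real pipeline the query embedding would come from the same embedding model used to index the key concepts, and the stored URL and start time would be combined into the timestamped link shown to the student.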

Challenges we ran into

We had a lot of difficulty getting GPT-4 to produce output in a consistent format. For example, the timestamps GPT-4 generated would occasionally be in a different format than we expected. We dealt with this by carefully crafting prompts to nudge GPT-4 towards the output we wanted. We also added checks in our code to catch potentially problematic outputs.
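As an illustration of the kind of check described above, here is a minimal sketch that normalizes a model-generated timestamp to seconds and rejects anything it cannot parse. The accepted formats (`MM:SS`, `HH:MM:SS`, `1h2m3s`, and bare seconds) are assumptions chosen for the example; the write-up does not say which formats GPT-4 actually produced, and `parse_timestamp` is a hypothetical name.

```python
import re
from typing import Optional

# Clock style: "MM:SS" or "HH:MM:SS".
_CLOCK = re.compile(r"^(?:(\d+):)?(\d{1,2}):(\d{2})$")
# Unit style: "1h2m3s", "12m30s", "75s", or bare seconds like "75".
_UNITS = re.compile(r"^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?$")


def parse_timestamp(raw: str) -> Optional[int]:
    """Convert a timestamp string to seconds, or return None if unparseable."""
    text = raw.strip().strip("[]()").lower()
    if not text:
        return None

    match = _CLOCK.match(text)
    if match:
        hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
        if seconds > 59:
            return None  # e.g. "12:75" is not a valid clock time
        return hours * 3600 + minutes * 60 + seconds

    match = _UNITS.match(text)
    if match:
        hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
        return hours * 3600 + minutes * 60 + seconds

    return None
```

Returning `None` instead of guessing lets the caller decide what to do with a bad value, for example dropping that key concept or asking the model again.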

Accomplishments that we're proud of

This was the first time any of us had competed in a hackathon, and we didn't know how much we would get done. We're proud that we built a functional, genuinely useful website within the time limit. We were also all inexperienced with frontend development, but we managed to create a pretty decent user experience anyway.

What we learned

We learned a lot about how to work with LLMs and the pitfalls of relying completely on their outputs. We also learned a ton about the vast and growing ecosystem of LLM-adjacent tooling.

What's next for StudyBuddy

We'd like to grow our embeddings database and make the website faster and more reliable overall. We'd also love to make our interactive chatbot smarter by incorporating frameworks like LangChain.
