Inspiration
I was inspired by the need for a more personalized and effective way to study. I wanted to create an interactive tool that allows students to actively engage with their study materials by asking questions and getting instant, contextual answers. By combining the power of LLMs with a simple chat interface, I aimed to build a smart study partner that is always available.
What it does
Study Assistant is a chatbot designed to help you learn from your own documents. You can upload a PDF file containing your study notes, textbooks, or research papers. Then we extract all the text from the pdf so that you can ask questions about the uploaded content. The contents of the pdf then become the knowledge base for the chatbot.
How we built it
I used a combination of Gradio for the front-end UI and PyPDF2 to extract text from uploaded files. The core functionality uses huggingface_hub for the LLM, and the application is all coded in Python.
Challenges we ran into
Having joined only a couple of days ago and not having much prior experience with large language models and their APIs, my biggest challenge was simply understanding how everything worked. Most of my time was spent on learning the basics of the Hugging Face API, how to structure prompts, and what to do with the model's output. I also had to figure out how to manage the input size for the language model. I needed to provide enough context from the PDF for the model to give good answers, but the model has a limited context window. I addressed this by sending a truncated version of the PDF text (the first 2000 characters) to the model.
Accomplishments that we're proud of
I am proud of creating a functional and intuitive application in such a short amount of time. Despite my limited prior knowledge, I successfully integrated several different libraries to work together smoothly.
What we learned
Through this project, I learned a great deal about building interactive machine learning applications. I gained practical experience with Gradio for UI development and learned how to effectively manage application state. I also deepened my understanding of the practicalities of working with large language models, including handling API interactions, managing context, and fine-tuning prompts to get the desired behavior.
What's next for Study Assistant
I want to implement a feature to handle larger PDFs by using chunking and embedding techniques to provide more comprehensive context to the model. I also plan to add support for other file types, such as Word documents or plain text files.
Built With
- huggingface
- python
Log in or sign up for Devpost to join the conversation.