AI Study Buddy

About the Project

AI Study Buddy is a lightweight Retrieval-Augmented Generation (RAG) assistant built with Streamlit.
It allows users to upload TXT or PDF files, index them using embeddings and FAISS, and ask questions about the content.
The app retrieves the most relevant chunks and generates accurate, context-grounded answers using a Groq LLM.
Each AI reply can be exported as a Word document (.docx) for sharing or note-taking.
The project was submitted under Track 1: AI Visibility & Prompt Discovery.

What Inspired the Project

I wanted a fast, private, and grounded study companion — like asking a knowledgeable colleague about your own documents.
Generic LLM demos often suffer from:

Hallucinations without context
Inconvenient copy/paste workflows
Always-online, centralized data handling

So, I built a simple, local RAG app focused on:

Answers from your uploaded files only
One-click .docx exports
A clean, approachable Streamlit UI

Key Learnings

Retrieval reduces hallucinations and improves factual accuracy.
The sentence-transformers model all-MiniLM-L6-v2 is efficient for short-text semantic search.
FAISS enables fast similarity search over the chunk embeddings.
Gained experience with Streamlit session state, UI toggles, and file handling.
Discovered PyPDF2’s limitations for scanned PDFs (future OCR integration needed).
Implemented automated Word file export with python-docx.

How It Works (High-Level)

Architecture

Frontend: Streamlit app (chat UI, upload, export)
Ingestion: PyPDF2 for PDFs, text reader for TXT
Chunking: RecursiveCharacterTextSplitter (size=500, overlap=50)
Embeddings: sentence-transformers (HuggingFaceEmbeddings)
Vector store: FAISS (in-memory)
Retrieval: top-k (k=3) chunks → context-constrained Groq LLM query
Export: .docx generation via python-docx + st.download_button
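
The chunking and retrieval steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's code: chunk_text is a simplified fixed-width stand-in for RecursiveCharacterTextSplitter (it ignores separators), embed is a toy bag-of-words vector standing in for the sentence-transformers embedding, and top_k is a brute-force stand-in for the FAISS search. All function names are hypothetical; only size=500, overlap=50, and k=3 come from the list above.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-width chunks; each chunk starts with the
    last `overlap` characters of the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the query (brute force)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

With the defaults, consecutive chunks share 50 characters, so a sentence that straddles a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap.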

Prompt Design

Prompts explicitly instruct:

“Provide only the final answer. Do not include chain-of-thought or hidden reasoning.”

Post-processing ensures clean, readable responses.
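
A minimal sketch of how such a prompt might be assembled and how a reply might be tidied afterwards. Only the quoted instruction comes from the project; the rest of the wording in SYSTEM_RULES, the function names build_prompt and clean_reply, and the specific clean-up steps (dropping `<think>` blocks, collapsing blank lines) are assumptions for illustration.

```python
import re

# The last sentence is the project's instruction; the first two are
# assumed wording for the "context-constrained" part of the prompt.
SYSTEM_RULES = (
    "Answer using only the context below. "
    "If the context does not contain the answer, say you do not know. "
    "Provide only the final answer. "
    "Do not include chain-of-thought or hidden reasoning."
)


def build_prompt(question: str, chunks: list[str]) -> str:
    """Number the retrieved chunks and place them ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        f"{SYSTEM_RULES}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\nAnswer:"
    )


def clean_reply(reply: str) -> str:
    """Hypothetical post-processing: strip reasoning blocks and extra blank lines."""
    reply = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    reply = re.sub(r"\n{3,}", "\n\n", reply)
    return reply.strip()
```

The string returned by build_prompt would be sent to the LLM as the user message, and clean_reply would run on the model's text before it is shown or exported.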

Challenges and Solutions

HTML/JS vs Streamlit components

Problem: Custom HTML buttons couldn’t trigger file pickers reliably.
Solution: Used native Streamlit buttons + session state toggles.

PDF Extraction Quality

Problem: PyPDF2 fails on image-based PDFs.
Solution: Added warning and noted OCR as a future enhancement.
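
One way such a warning could be triggered is to check how much text extraction actually returned. The sketch below is an assumption about the approach, not the project's implementation: looks_scanned and the 20-characters-per-page threshold are hypothetical, and the function takes already-extracted page strings so it does not depend on any PDF library.

```python
from typing import Optional, Sequence


def looks_scanned(page_texts: Sequence[Optional[str]],
                  min_chars_per_page: int = 20) -> bool:
    """Return True when the extracted text is too sparse for a text-based PDF.

    `page_texts` holds one entry per page, for example the values returned by
    PyPDF2's page.extract_text(); empty strings and None count as no text.
    """
    if not page_texts:
        return True  # nothing extracted at all
    total_chars = sum(len((text or "").strip()) for text in page_texts)
    return total_chars < min_chars_per_page * len(page_texts)


# In the app, a True result could drive a Streamlit warning, e.g.:
#   if looks_scanned(page_texts):
#       st.warning("This PDF looks image-based; text extraction may be empty.")
```

A per-page average is used rather than a check for completely empty output, because scanned PDFs sometimes yield a few stray characters from headers or stamps.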

Memory and Persistence

Problem: FAISS index stored only in memory (lost on reload).
Solution: Kept in session for prototype; disk persistence planned.

Reducing Hallucinations

Problem: LLMs can still fabricate information, even when given retrieved context.
Solution: Tight prompt control and retrieval-only context.

Export Fidelity

Problem: Wanted metadata and citations in Word exports.
Solution: Basic export works; structured metadata planned next.

Conclusion

AI Study Buddy shows how a small RAG-based app can deliver accurate, document-aware answers while preserving privacy and ease of use.
It emphasizes practical learning, user control, and a lightweight approach to applied AI.

Built With

ai
local deployment
Hugging Face embeddings (all-MiniLM-L6-v2)
Groq (LLM API, for fast LLM processing)
PyPDF2 (file handling)
LangChain (orchestration)
sentence-transformers (Hugging Face, AI/NLP)
Python (language)
Streamlit (UI framework)
prompt
python-docx
reimagineweb
torch
FAISS (in-memory vector store, vector search)
track1

Submitted to

WebMind Innovation Hackathon

Created by

Team Lead, Developer & AI Expert

Key Contributions:

Ideation & Leadership: Conceived the idea for AI Study Buddy and led the project from concept to deployment under Track 1: AI Visibility & Prompt Discovery.

System Architecture: Designed the complete RAG-based architecture integrating Streamlit, FAISS, HuggingFace Embeddings, and Groq LLM for fast, context-aware responses.

Core Development:

Implemented file ingestion (TXT/PDF) and chunking logic using PyPDF2 and RecursiveCharacterTextSplitter.

Built the embedding pipeline with sentence-transformers (all-MiniLM-L6-v2) and vector search using FAISS.

Integrated Groq API for LLM queries with retrieval-augmented prompts.

AI Engineering:

Optimized retrieval prompts to minimize hallucinations.

Implemented RAG logic for accurate, grounded responses.

Export Automation: Developed automatic .docx export using python-docx for each AI-generated reply.

UI & UX Implementation: Created the full Streamlit interface, handling session state, file uploads, and interactive chat layout.

Technical Documentation: Authored detailed README, architecture overview, and future improvement roadmap.

Mentorship: Guided team workflow, ensured quality integration, and oversaw testing and debugging phases.

Malik Suffian
I actively contributed to the development and presentation of AI Study Buddy. My key roles included:

Designing the concept and defining the main features such as summarizing, assignment generation, and quiz preparation.

Writing and refining the project content, including the tagline, project story, and presentation script.

Collaborating with the team to improve the user interface and make the app more engaging for students.

Pitching the idea effectively to judges by highlighting its educational impact and innovation.

Momina Noor
QA Engineer, Designer & Data Assistant

Key Contributions:

Quality Assurance:

Performed end-to-end testing of chat functionality, document uploads, and .docx export flow.

Validated retrieval accuracy and cross-checked LLM outputs against document content.

Logged and reported bugs during UI interaction and file handling.

UI/UX Design:

Designed the interface layout, color scheme, and Streamlit styling for a clean and user-friendly appearance.

Suggested improvements to component placement and button flow for better usability.

API Testing:

Tested Groq API responses, verified latency, and ensured consistent prompt-response cycles.

Assisted in debugging API key and model response handling.

Data Handling & Scraping:

Collected and formatted sample PDF/TXT documents used for testing and demonstration.

Helped in preparing structured data for evaluation.

Documentation Support:

Contributed to drafting project write-up and summarizing key learnings and challenges.

Project Assistance:

Provided regular feedback during development, validating improvements in retrieval accuracy and export consistency.

Aamna Sohail

Updates

Malik Suffian started this project — Oct 10, 2025 08:04 AM EDT
