University Chatbot with admin portal

Student authentication
Admin Portal

Inspiration

This project was inspired by my own university: the sheer number of documents in the university portal motivated me to build it.

What it does

This Chatbot is a context-aware AI system designed for educational institutions, featuring both an Admin Portal and a Student Portal. The Admin Portal allows administrators to upload, edit, and manage PDF files (e.g., textbooks, lecture notes) and their metadata (departments, semesters, tags) using a Streamlit-based interface, with files securely stored in AWS S3. The Student Portal enables students to log in using authentication, providing personalized context (e.g., department, semester) to tailor their queries. The chatbot leverages Cortex Search to retrieve the most relevant document chunks based on the student's query and context, then uses a Mistral LLM to generate accurate, context-aware responses. This ensures students receive precise answers grounded in the latest educational resources, while administrators maintain and organize the document repository efficiently. Together, the system provides a seamless, secure, and context-aware experience for both administrators and students.

How we built it

Admin Portal:
- Built with Streamlit.
- Upload, edit, and delete PDFs and metadata.
- Metadata includes departments, semesters, and tags.
- Files stored in AWS S3 using boto3.
- Automatically generates and uploads metadata CSV files.
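The last step above, writing a metadata CSV alongside each uploaded PDF, could look roughly like the sketch below. The column names, the `pdfs/` and `metadata/` key prefixes, and both helper names are assumptions made for illustration, not the project's actual layout. `s3_client` is expected to be a `boto3.client("s3")` object; only its `put_object` method is used.

```python
import csv
import io


def build_metadata_csv(filename, departments, semesters, tags):
    """Return CSV bytes: one header row plus one row describing the PDF."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["filename", "departments", "semesters", "tags"])
    # Multi-valued fields are joined with ';' so each fits in one CSV cell.
    writer.writerow([
        filename,
        ";".join(departments),
        ";".join(str(s) for s in semesters),
        ";".join(tags),
    ])
    return buffer.getvalue().encode("utf-8")


def upload_pdf_with_metadata(s3_client, bucket, filename, pdf_bytes,
                             departments, semesters, tags):
    """Upload the PDF and its metadata CSV; return the two object keys."""
    # Hypothetical key layout: pdfs/<name> and metadata/<name>.csv
    pdf_key = f"pdfs/{filename}"
    meta_key = f"metadata/{filename}.csv"
    s3_client.put_object(Bucket=bucket, Key=pdf_key, Body=pdf_bytes)
    s3_client.put_object(
        Bucket=bucket,
        Key=meta_key,
        Body=build_metadata_csv(filename, departments, semesters, tags),
    )
    return pdf_key, meta_key
```

Deriving the CSV key from the PDF filename keeps the two objects linked without relying on S3 object headers.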
Student Portal:
- Built with Streamlit.
- Student login with authentication using Snowflake.
- Personalized context (department, semester) passed to the chatbot.
- Chat interface for querying and receiving responses.
Context-Aware RAG Chatbot:
- Uses Cortex Search to retrieve relevant document chunks.
- Generates responses using Mistral LLM (e.g., mistral-7b).
- Integrates with Snowflake Cortex for queries and LLM processing.
- Tailors responses based on student context (department, semester).
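The retrieve-then-generate flow described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `department` and `semester` filter columns, the `chunk` result field, and the injected `search_fn` and `complete_fn` callables are hypothetical stand-ins for a Cortex Search service query and a Snowflake Cortex completion call. The `@and` / `@eq` dictionary follows the Cortex Search filter syntax, which only works on columns declared as attributes of the search service.

```python
def build_context_filter(department, semester):
    """Restrict search results to the student's department and semester."""
    return {
        "@and": [
            {"@eq": {"department": department}},
            {"@eq": {"semester": str(semester)}},
        ]
    }


def build_prompt(question, chunks, department, semester):
    """Assemble a prompt that grounds the model in the retrieved chunks."""
    context = "\n\n".join(chunk["chunk"] for chunk in chunks)
    return (
        f"You are a university assistant for a {department} student "
        f"in semester {semester}.\n"
        "Answer using only the context below. If the answer is not in the "
        "context, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_question(search_fn, complete_fn, question, department, semester,
                    limit=5, model="mistral-7b"):
    """Retrieve relevant chunks, then ask the LLM to answer from them.

    search_fn(query, filter, limit) -> list of dicts with a "chunk" key
    complete_fn(model, prompt) -> str
    Both are hypothetical wrappers around the Snowflake calls.
    """
    chunks = search_fn(question, build_context_filter(department, semester), limit)
    prompt = build_prompt(question, chunks, department, semester)
    return complete_fn(model, prompt)
```

Passing the search and completion calls in as functions keeps the prompt and filter logic testable without a live Snowflake session.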
Authentication:
- Student credentials stored in Snowflake.
- Login managed using Streamlit Session State.
- Ensures secure access and personalized context.
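A minimal sketch of the login step follows, assuming credentials are stored as a salted PBKDF2 hash (the write-up does not say how passwords are stored, so this is an assumption). `fetch_student` is a hypothetical lookup that would query the Snowflake credentials table, and `session_state` can be Streamlit's `st.session_state`, which supports dictionary-style assignment.

```python
import hashlib
import hmac


def hash_password(password, salt):
    """Derive a hex digest from the password and salt with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    ).hex()


def login(session_state, username, password, fetch_student):
    """Verify credentials and store the student's context in session state.

    fetch_student(username) -> dict with "salt" (hex), "password_hash",
    "department", "semester", or None if the user does not exist.
    """
    record = fetch_student(username)
    if record is None:
        return False
    candidate = hash_password(password, bytes.fromhex(record["salt"]))
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(candidate, record["password_hash"]):
        return False
    session_state["authenticated"] = True
    session_state["student"] = {
        "username": username,
        "department": record["department"],
        "semester": record["semester"],
    }
    return True
```

Because Streamlit reruns the script on every interaction, keeping the department and semester in session state lets the chatbot reuse them without asking the student to log in again.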
Integration:
- Streamlit for frontend (Admin and Student Portals).
- Snowflake for backend (authentication, Cortex Search, Mistral LLM).
- AWS S3 for file and metadata storage.
- boto3 for S3 interactions.
Deployment:
- Deployed using Streamlit Sharing or Snowflake Native App Framework.
- AWS S3 hosts PDFs and metadata.
- Snowflake hosts backend logic and processing.
Technologies Used:
- Streamlit (frontend).
- Snowflake (backend, authentication, Cortex Search, Mistral LLM).
- AWS S3 (storage).
- boto3 (S3 interaction).
- Snowpark Python (Snowflake integration).

Challenges we ran into

Metadata management: extracting metadata from AWS S3 object headers proved difficult, so the workaround was to store metadata in CSV files.
Automating file management was very difficult.
Authentication Management:
- Ensuring secure and seamless student login while maintaining personalized context (department, semester).
- Handling session state in Streamlit for persistent user authentication.
Metadata Consistency:
- Managing metadata (departments, semesters, tags) across uploaded files.
- Ensuring metadata updates are reflected accurately in both AWS S3 and the chatbot's retrieval system.
Cortex Search Integration:
- Configuring Cortex Search to retrieve the most relevant document chunks based on student queries and context.
- Optimizing search performance for large datasets.
LLM Response Quality:
- Fine-tuning Mistral LLM (e.g., mistral-7b) to generate accurate and contextually appropriate responses.
- Balancing response generation speed with quality.
File Management in S3:
- Handling large file uploads and deletions efficiently in AWS S3.
- Ensuring metadata CSV files are correctly linked to their corresponding PDFs.
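One way to check that every PDF still has its metadata CSV (and the reverse) is to compare object keys. The sketch below assumes a hypothetical naming convention in which `pdfs/<name>.pdf` is paired with `metadata/<name>.pdf.csv`; the function name and prefixes are illustrative. The key list could come from a boto3 `list_objects_v2` paginator.

```python
def find_unlinked_objects(keys, pdf_prefix="pdfs/", meta_prefix="metadata/"):
    """Return (PDFs without metadata, metadata files without a PDF)."""
    # Strip the prefix so both sets hold bare PDF filenames.
    pdfs = {
        key[len(pdf_prefix):]
        for key in keys
        if key.startswith(pdf_prefix) and key.endswith(".pdf")
    }
    # Strip the prefix and the trailing ".csv" to recover the PDF filename.
    metas = {
        key[len(meta_prefix):-len(".csv")]
        for key in keys
        if key.startswith(meta_prefix) and key.endswith(".csv")
    }
    return sorted(pdfs - metas), sorted(metas - pdfs)
```

Running a check like this after each upload or deletion would surface orphaned files before they reach the retrieval index.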
Streamlit Performance:
- Managing real-time updates and interactions in the Streamlit app without performance bottlenecks.
- Ensuring a smooth user experience for both administrators and students.

Accomplishments that we're proud of

Managed to work around many problems.
Building the fully fledged portal and automating file uploads and deletions was rewarding.

What we learned

Got familiar with Snowflake's environment.
Learned to use Streamlit.
Learned SQL programming.
Learned how RAG systems work.

What's next for University Chatbot

Editing file metadata does not work yet and needs fixing.
Scaling to larger PDFs.
Fine-tuning the LLM further.

Built With

amazon-web-services
python
snowflake
snowpark
sql
streamlit

Updates

hks3333 S started this project — Jan 21, 2025 07:55 PM EST
