RAG 'n' ROLL: AI-Powered Research Assistant

Inspiration

The overwhelming volume of research papers and academic articles inspired us to create a tool that simplifies the research process. We wanted to combine cutting-edge AI technologies with user-friendly interfaces to make academic exploration more efficient and insightful. The goal was to reduce the time researchers spend sifting through papers and help them focus on generating impactful ideas.

What it does

RAG 'n' ROLL is an AI-powered research assistant that:

Analyzes Research Papers: Processes uploaded PDF or text documents to extract key insights, including research questions, claims, and evidence.
Performs Intelligent Searches: Lets users find relevant papers by comparing Sentence Transformers embeddings with cosine similarity.
Generates Summaries: Uses the Mistral LLM to produce concise summaries of complex documents.
Evaluates Search Performance: Provides relevance feedback with TruLens to continuously improve search accuracy.
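
To illustrate the embedding search described above, here is a minimal sketch of ranking documents by cosine similarity. It is not the project's actual code: the function names, the toy 3-dimensional vectors, and the document IDs are all hypothetical, and in the real application the vectors would come from a Sentence Transformers model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return (doc_id, score) pairs for the k most similar documents."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
docs = {
    "paper_a": [1.0, 0.0, 0.0],
    "paper_b": [0.7, 0.7, 0.0],
    "paper_c": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.1, 0.0], docs, k=2)
print(results)  # paper_a ranks first, then paper_b
```

Cosine similarity depends only on vector direction, not length, which is why it is a common choice for comparing text embeddings.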

How we built it

We utilized a combination of powerful tools and technologies:

Frontend: Built using Streamlit for a responsive and intuitive interface.
Backend: Managed data with Snowflake and implemented Cortex Search for retrieval-augmented generation (RAG).
Language Model: Integrated Mistral LLM for summarization and insight generation.
Search Optimization: Used TruLens to measure and enhance search performance.
Libraries: Leveraged PyMuPDF for PDF text extraction and Sentence Transformers for embedding generation.
API Integration: Connected to external APIs such as Julep for document analysis tasks.
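
As a rough sketch of the extraction step in this stack, the snippet below pulls text out of a PDF with PyMuPDF and splits it into overlapping word chunks ready for embedding. The function names, chunk size, and overlap are assumptions made for illustration; the write-up does not say how the project chunks documents.

```python
def extract_text(pdf_path):
    """Concatenate the plain text of every page in a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the chunker below runs without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Tiny illustrative input; a real call would pass extract_text("paper.pdf").
sample_chunks = chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1)
print(sample_chunks)
```

Each chunk would then be passed to the embedding model separately, which keeps inputs within the model's length limit.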

Challenges we ran into

API Integration: Managing multiple APIs, including Julep, Mistral, and Snowflake, was complex and required robust error handling.
Data Embeddings: Ensuring the accuracy and efficiency of embeddings for large documents presented technical challenges.
Search Optimization: Fine-tuning the search results for relevance required significant testing and iteration.
Deployment: Deploying an AI-powered application with multiple moving parts to a live environment while maintaining performance was challenging.
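
One common way to get the kind of robust error handling mentioned above is to retry transient failures with exponential backoff. The sketch below is a generic pattern, not the project's code: `call_with_retry`, its parameters, and the simulated `flaky_call` are hypothetical, and a real client for Julep, Mistral, or Snowflake would raise its own exception types.

```python
import time


def call_with_retry(func, max_attempts=4, base_delay=0.5,
                    retry_on=(ConnectionError, TimeoutError),
                    sleep=time.sleep):
    """Call func(), retrying on transient errors with exponential backoff.

    The wait doubles after every failed attempt: base_delay, 2*base_delay, ...
    The final failure is re-raised so the caller can handle it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated external call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated outage")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_call, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test, since the waits can be recorded rather than actually slept through.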

Accomplishments that we're proud of

Successfully built a fully functional research assistant that integrates multiple cutting-edge technologies.
Achieved smooth interaction between Cortex Search, Mistral LLM, and Streamlit for a seamless user experience.
Implemented relevance feedback using TruLens to improve search accuracy and demonstrate measurable performance gains.
Deployed the application live for users to explore and benefit from.

What we learned

Collaboration Across Tools: We gained valuable experience integrating diverse tools like Snowflake, Mistral, and Julep into a cohesive system.
Search Optimization: We learned the importance of relevance metrics and iterative improvement for search accuracy.
Frontend-Backend Synergy: Streamlit proved to be an excellent choice for building a user-friendly front end for a sophisticated backend.
Scaling AI Applications: We came to understand the complexities of deploying AI applications in a real-world environment.
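
To make the idea of a relevance metric concrete, here is a small sketch of two standard ones, precision at k and reciprocal rank. This is not the TruLens API, and it assumes a hand-labeled set of relevant documents per query, which the write-up does not describe; the document IDs are hypothetical.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical ranked search results and ground-truth labels for one query.
retrieved = ["p3", "p1", "p7", "p2"]
relevant = {"p1", "p2"}

p_at_2 = precision_at_k(retrieved, relevant, k=2)
rr = reciprocal_rank(retrieved, relevant)
print(p_at_2, rr)
```

Tracking numbers like these across changes to the chunking or embedding setup is one way to see whether an iteration actually improved search.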

What's next for RAG 'n' ROLL: AI-Powered Research Assistant

Enhanced Visualizations: Add more interactive data visualizations to enrich the user experience.
Broader Document Support: Expand support for additional document types and formats.
Improved Performance: Optimize the backend to handle larger datasets and scale for enterprise use.
Advanced Features: Incorporate features like topic modeling, trend analysis, and citation suggestions.
Fine-Tuning the LLM: Experiment with fine-tuning the Mistral LLM for domain-specific research tasks.
Community Collaboration: Engage the academic and developer communities for feedback and future collaboration.

Built With

cortex
julep
mistral
pymupdf
python
snowflake
streamlit

Updates

Soulemane Sow started this project — Jan 21, 2025 08:02 AM EST
