Auto-RetrieverRAG

Inspiration

Higher education and research institutions often struggle with complex, metadata-driven searches that go beyond simple keyword matches. Inspired by the need for smarter, flexible search tools in academic settings, we combined Pinecone's vector search with large language models (LLMs) to create Dynamic RAG Auto-Retrieval. This system is designed to handle complex queries for research papers, datasets, and academic content, providing precise and adaptable results through customizable prompts and dynamic metadata retrieval.

What it does

This project demonstrates how to build an auto-retrieval system tailored for higher education and research institutions, using Pinecone for vector storage and Arize Phoenix for trace visualization. Researchers and academics can query large, semi-structured datasets (such as papers, research metadata, or journal articles) using natural language. The system leverages vector embeddings and dynamic metadata filters to return relevant information while guiding LLMs in interpreting queries accurately. It goes beyond basic top-k search by incorporating metadata-driven retrieval, making it highly suitable for academic and research environments.
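To make the difference between plain top-k search and metadata-driven retrieval concrete, here is a minimal, standard-library sketch. It is not the project's Pinecone code: the tiny two-dimensional vectors, the PAPERS list, and the (operator, value) filter format are all assumptions made for illustration. The idea it shows is that metadata filters narrow the candidate set first, and similarity ranking is applied only to what remains.

```python
import math
import operator

# Toy corpus with hand-made embeddings and metadata (illustrative values only).
PAPERS = [
    {"id": "p1", "vec": [1.0, 0.0], "meta": {"year": 2021, "topic": "Genomics"}},
    {"id": "p2", "vec": [0.9, 0.1], "meta": {"year": 2018, "topic": "Genomics"}},
    {"id": "p3", "vec": [0.0, 1.0], "meta": {"year": 2021, "topic": "Robotics"}},
]

# Supported comparison operators for this sketch.
OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches(meta, filters):
    """True if the record satisfies every filter: field -> (operator, value)."""
    return all(
        field in meta and OPS[op](meta[field], value)
        for field, (op, value) in filters.items()
    )


def search(query_vec, filters, top_k=2):
    """Filter by metadata first, then rank the survivors by similarity."""
    candidates = [p for p in PAPERS if matches(p["meta"], filters)]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return [p["id"] for p in ranked[:top_k]]
```

With an empty filter dict this reduces to ordinary top-k search. Adding a filter such as {"year": (">=", 2020)} removes the older paper before ranking, even though its vector is close to the query.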

How we built it

The project integrates Pinecone for vector storage, Arize Phoenix for monitoring, and LlamaIndex for auto-retrieval. Key steps include:

Setup: Initializing Pinecone and building a vector index to store academic metadata (authors, publication year, topics).
Vector Index Auto-Retriever: Creating the VectorIndexAutoRetriever to process user queries and map them to vectorized academic content using metadata filters.
Custom Prompts: Customizing the LLM prompts to handle metadata like publication year, author, and research domain accurately.
Dynamic Metadata Retrieval: Implementing a system that dynamically retrieves metadata examples to help the LLM infer the correct filters and return accurate, context-driven results.
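The dynamic metadata retrieval step can be sketched roughly as follows. This is a simplified illustration, not the project's implementation: KNOWN_VALUES, relevant_values, and build_prompt are hypothetical names, the sample values are made up, and plain word overlap stands in for the embedding lookup a real system would use to find related metadata values. The point is only that stored values related to the query are placed in the prompt so the LLM can copy their exact spelling into its filters.

```python
# Hypothetical catalog of metadata values already present in the index.
KNOWN_VALUES = {
    "topic": ["Genomics", "Robotics", "Climate Science"],
    "author": ["A. Rivera", "B. Chen"],
}


def relevant_values(query, known_values, limit=3):
    """Pick stored metadata values that share a word with the query.

    Word overlap is a stand-in (an assumption of this sketch) for the
    vector similarity lookup a real system would perform.
    """
    query_words = set(query.lower().split())
    picked = {}
    for field, values in known_values.items():
        hits = [v for v in values if query_words & set(v.lower().split())]
        if hits:
            picked[field] = hits[:limit]
    return picked


def build_prompt(query, known_values):
    """Assemble a filter-inference prompt that lists exact stored values."""
    lines = [
        "Infer metadata filters for the user query.",
        "Use only these exact stored values when filtering:",
    ]
    for field, values in relevant_values(query, known_values).items():
        lines.append(f"- {field}: {', '.join(values)}")
    lines.append(f"Query: {query}")
    return "\n".join(lines)
```

For example, build_prompt("recent genomics papers by chen", KNOWN_VALUES) lists "Genomics" and "B. Chen" in the prompt, which gives the LLM the stored capitalization to reuse.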

Challenges we ran into

Handling Metadata Filters: Ensuring that the LLM correctly infers metadata filters like year, theme, and author from user queries, especially when the data requires specific formatting (e.g., capitalized themes).
Query Precision: Balancing free-form natural language input against structured database queries, so that the prompt is not overloaded and results stay accurate.
Error Handling: Dealing with queries that fail or return no results because the LLM inferred incorrect metadata filters.
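One possible way to handle the empty-result case is to retry with fewer filters. The sketch below is an assumption about how such a fallback could look, not a description of what the project does: retrieve_with_fallback, demo_search, and the DOCS list are hypothetical, and the search function only models exact-match filtering.

```python
# Tiny in-memory dataset used only to exercise the fallback logic.
DOCS = [
    {"title": "Soil carbon survey", "year": 2020, "theme": "Climate"},
    {"title": "Gripper design notes", "year": 2022, "theme": "Robotics"},
]


def demo_search(query, filters):
    """Hypothetical search: ignores the query text, applies exact-match filters."""
    return [d["title"] for d in DOCS if all(d.get(k) == v for k, v in filters.items())]


def retrieve_with_fallback(search_fn, query, filters, drop_order):
    """Try the full filter set, then drop filters one at a time until results appear.

    drop_order lists filter fields from least to most trusted.
    Returns (results, filters_actually_used) so the caller can tell the
    user which constraints were relaxed.
    """
    active = dict(filters)
    results = search_fn(query, active)
    for field in drop_order:
        if results:
            break
        if field in active:
            del active[field]
            results = search_fn(query, active)
    return results, active
```

Returning the filters that were actually used matters here: silently relaxing a year or author constraint could otherwise present loosely related papers as exact matches.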

Accomplishments that we're proud of

Dynamic Metadata Retrieval: Successfully implementing a dynamic retrieval system that enhances query accuracy by feeding relevant metadata into the prompt.
Customizable Prompting: Creating a flexible prompting system that can easily be adjusted to fit different datasets and retrieval needs.
Seamless Integration: Bringing together multiple technologies (Pinecone, Arize Phoenix, and LlamaIndex) to build an advanced retrieval system.

What we learned

Advanced Retrieval Techniques: Learned how to extend basic top-k semantic search into a more sophisticated auto-retrieval system using metadata and prompt customization.
Dynamic Query Structuring: Gained insights into dynamically structuring queries based on user input, using few-shot examples in the prompt to guide LLMs toward the most relevant data.
Metadata Management: Learned how much consistent metadata formatting matters, and that the LLM needs to know how to apply filters correctly for the dataset schema.
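The metadata formatting lesson can be illustrated with a small normalization pass applied to LLM-inferred filters before they reach the vector store. This is a sketch under stated assumptions: the SCHEMA fields mirror the year, theme, and author examples mentioned above, but the types, the TITLE_CASED set, and normalize_filters itself are hypothetical.

```python
# Assumed schema: field name -> Python type of the stored value.
SCHEMA = {
    "year": int,
    "theme": str,
    "author": str,
}

# Fields whose stored values are assumed to use title case.
TITLE_CASED = {"theme"}


def normalize_filters(raw):
    """Coerce LLM-inferred filters to the schema and drop anything unusable."""
    clean = {}
    for field, value in raw.items():
        expected = SCHEMA.get(field)
        if expected is None:
            continue  # field not in the schema: skip it instead of sending a bad filter
        try:
            value = expected(value)
        except (TypeError, ValueError):
            continue  # value cannot be coerced, e.g. year="last decade"
        if isinstance(value, str):
            value = value.strip()
            if field in TITLE_CASED:
                value = value.title()
        clean[field] = value
    return clean
```

Note that str.title() is a blunt tool: it would turn an acronym such as "AI" into "Ai", so a real dataset may need a lookup against the stored values instead.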

What's next for Dynamic RAG Auto-retriever with AI

Scalability: Expanding the system to handle larger, more complex datasets, potentially using additional metadata fields or integrating external knowledge bases.
Improved Prompt Customization: Further refining the prompt templates and adding more few-shot examples to improve accuracy across different types of queries.
Real-Time Analytics: Enhancing the system with real-time analytics and feedback loops, allowing it to adapt dynamically based on user interactions and improve over time.

Built With

arize
llamaindex
openai
phoenix
pinecone

Updates

Jeffrey Chu started this project — Oct 13, 2024 05:59 PM EDT
