Inspiration

Our inspiration came from the core limitations of traditional Retrieval-Augmented Generation (RAG). Standard RAG systems are constrained by a static, single-pass retrieval process. This fundamental flaw means they often fail when handling complex, multi-hop queries that require dynamic, multi-step reasoning and the ability to synthesize information from multiple sources. We saw a need for a system that could overcome this and actively reason and problem-solve, not just retrieve and report.

What it does

Our project, the Context-Driven AgenticRAG System, moves beyond static RAG by creating a dynamic, multi-agent framework. At its core, an intelligent Agent Router analyzes an incoming query and delegates it to the most appropriate specialized agent.

It includes:

CorrectiveAgent: Evaluates and refines retrieval quality. It uses web search (via Perplexity and Tavily) to self-correct when the initial documents from the vector store are irrelevant or insufficient, reducing hallucinations.

PreActAgent (ReActAgent): Handles complex, multi-step reasoning. It's designed to break down large, multi-part queries into a sequence of smaller, executable thoughts and actions.

WorkspaceAgent: Acts as a personal assistant. It securely accesses and utilizes a user's personal, real-time data, such as sending emails or checking calendar events via Google APIs.

Global Memory: A shared context module (using the Memori library) that allows agents to retain information from conversations, providing personalized and context-aware answers over time.
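The delegation step performed by the Agent Router can be sketched in a few lines. This is an illustration only: a keyword heuristic stands in for the LLM classifier, and the function and agent names here are hypothetical, not taken from our codebase.

```python
# Minimal illustration of agent routing. In the real system an LLM
# classifies the incoming query; a keyword heuristic stands in for it here.

def classify(query: str) -> str:
    """Hypothetical stand-in for the LLM-based intent classifier."""
    q = query.lower()
    if any(w in q for w in ("email", "calendar", "schedule")):
        return "workspace"
    if any(w in q for w in ("compare", "then", "step")):
        return "react"
    return "corrective"

# Each agent is represented by a placeholder callable.
AGENTS = {
    "corrective": lambda q: f"[CorrectiveAgent] answering: {q}",
    "react": lambda q: f"[ReActAgent] planning for: {q}",
    "workspace": lambda q: f"[WorkspaceAgent] handling: {q}",
}

def route(query: str) -> str:
    """Delegate the query to the agent chosen by the classifier."""
    return AGENTS[classify(query)](query)

print(route("Check my calendar for Friday"))
```

In the actual system this dispatch is expressed as conditional edges in a LangGraph graph rather than a plain dictionary lookup.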

How we built it

We built the system on a robust tech stack designed for modularity and power.

Frameworks: We used LangChain and LangGraph to architect the core agentic framework. This allowed us to define the agents and the stateful, cyclical data flows between them.

Database & Retrieval: We set up ChromaDB as our vector database. We built a data ingestion pipeline using OpenAI embeddings (text-embedding-3-small) to process source documents and make them searchable.
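The first stage of that ingestion pipeline can be illustrated with a simple fixed-size chunker with overlap. This is a stand-in sketch, not our actual pipeline code; in the real system each chunk is then embedded with text-embedding-3-small and written to ChromaDB.

```python
# Illustrative document chunking for an ingestion pipeline.
# Overlapping windows help preserve context across chunk boundaries;
# each chunk would then be embedded and stored in the vector database.

def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # advance by the non-overlapping stride
    return chunks

doc = "x" * 1200
print(len(chunk_text(doc, size=500, overlap=50)))  # 3 chunks
```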

LLMs & Tools: The agents are powered by OpenAI and Google Gemini models. We integrated external tools to give them real-world capabilities, including Perplexity and Tavily for real-time web search and Google Calendar/Email APIs for the WorkspaceAgent.

Memory: We are integrating the open-source Memori library to manage persistent, global memory for all agents.

Challenges we ran into

Our two biggest challenges were global memory management and evaluation dataset creation.

Global Memory: An initial attempt at implementing global memory using LangChain's basic in-memory store led to significant issues with data duplication. This not only increased costs by feeding redundant context to the LLMs but also increased the risk of model hallucination. We had to halt that approach and are now pivoting to the more robust Memori library to solve this.
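The duplication problem can be handled by keying each memory entry on a content hash before it is written, which is roughly the behavior we expect the replacement memory layer to give us. The sketch below is our own illustration, not the Memori API.

```python
import hashlib

class DedupMemory:
    """Toy global memory store that skips exact-duplicate entries,
    keeping redundant context out of the LLM prompt."""

    def __init__(self):
        self._seen: set[str] = set()
        self._entries: list[str] = []

    def add(self, entry: str) -> bool:
        """Store the entry unless an identical one already exists."""
        key = hashlib.sha256(entry.encode("utf-8")).hexdigest()
        if key in self._seen:
            return False  # duplicate: don't re-feed it to the model
        self._seen.add(key)
        self._entries.append(entry)
        return True

    def context(self) -> str:
        """Return the deduplicated context to prepend to a prompt."""
        return "\n".join(self._entries)

mem = DedupMemory()
mem.add("User prefers morning meetings.")
mem.add("User prefers morning meetings.")  # silently skipped
print(mem.context())
```

Exact-match hashing only catches verbatim duplicates; near-duplicate detection (e.g., via embedding similarity) is one reason a dedicated memory library is the better long-term answer.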

Dataset Creation: We discovered that no existing dataset is designed to test a multi-agent system that also accesses web search and personal user data (emails/calendar). We are now in the process of manually creating a custom dataset by adapting existing RAG datasets (from Hugging Face) and writing our own complex, multi-hop queries to properly evaluate our system's unique capabilities.

Accomplishments that we're proud of

We are particularly proud of completing the entire foundational infrastructure and successfully implementing our first, and most complex, agent: the CorrectiveAgent.

This agent is fully functional and successfully demonstrates the "iterative self-correction" in our project's title. It can retrieve documents from ChromaDB, intelligently evaluate their relevance, and—if the documents are insufficient—it automatically triggers a web search to find better information before synthesizing a final, accurate answer. This proves our core concept of moving beyond a static, single-pass system.
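In outline, the self-correction loop looks like this. The helper functions below are stubs standing in for ChromaDB retrieval, the LLM relevance grader, Perplexity/Tavily search, and LLM synthesis; only the control flow reflects the actual agent.

```python
# Outline of the CorrectiveAgent's control flow with stub helpers.

def retrieve(query: str) -> list[str]:
    return []  # stub: pretend the vector store had nothing relevant

def is_sufficient(query: str, docs: list[str]) -> bool:
    return len(docs) > 0  # stub for the LLM relevance check

def web_search(query: str) -> list[str]:
    return [f"web result for: {query}"]  # stub for Perplexity/Tavily

def synthesize(query: str, docs: list[str]) -> str:
    return f"answer from {len(docs)} source(s)"  # stub for the LLM

def corrective_answer(query: str) -> str:
    docs = retrieve(query)
    if not is_sufficient(query, docs):
        docs = web_search(query)  # self-correct with fresher sources
    return synthesize(query, docs)

print(corrective_answer("example query"))
```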

What we learned

This project has been a deep dive into the practical limitations of standard RAG. We learned that true reasoning requires a dynamic, multi-agent architecture where different specialized agents can collaborate.

We also learned that managing state and memory is one of the most critical and difficult parts of building an agentic system. Simply giving an agent a memory isn't enough; the memory must be actively managed to prevent data duplication and context bloat. Finally, we've seen firsthand how vital it is to give agents tools (like web search) to self-correct and ground their responses in real-time information.

What's next for Context-Driven AgenticRAG System with Iterative Self-Correction

Our immediate next steps are to build out the remaining components of our system:

Implement Remaining Agents: Complete the implementation of the WorkspaceAgent (integrating Google APIs) and the ReActAgent (for complex planning).

Build the Router: Create the LLM-based Agent Router that will intelligently analyze and delegate tasks to the correct agent.

Integrate Memory: Finalize the integration of the Memori library for persistent global memory.

Evaluate: Run a rigorous evaluation using our custom-built dataset and cosine similarity metrics to quantify our system's performance against standard RAG.
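The evaluation metric in the last step is ordinary cosine similarity between the embedding of a generated answer and that of the reference answer:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors:
    dot(a, b) / (|a| * |b|). Ranges from -1 (opposite) to 1 (identical
    direction); higher means the answers are semantically closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In practice the inputs would be the 1536-dimensional vectors produced by the embedding model rather than these toy 2-D examples.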

Looking further ahead, we plan to fine-tune models for domain-specific tasks and add smart browser agents for web automation.

Built With

  • langchain
  • llm
  • memori