Inspiration

A long time ago, in a galaxy far, far away, there lived coders who wrote Java in Eclipse IDE. In those days, for a Java Jedi to switch from EJB to Spring was like turning from the dark side to the light. As time passed, the power of the Jedi grew, IDEs evolved, new languages and tools emerged, and Docker became an essential part of the Java Jedi’s force.

Back when SVN was the go-to version control system and Maven was the beloved build tool, containers were still exotic. Then, in 2013, Docker came along and made containerization so accessible that even young Padawans began using it. The term “microservices” soon became part of the daily vocabulary in all Jedi schools.

Microservices were everywhere: in banking apps, ticketing systems, search engines. Breaking code into functional units became so vital that even my ex wanted more microservices from me than anything else.

Let’s bring it back to the present day. Containerization is now a well-established part of the workflow, while AI tooling, from my perspective, hasn’t yet reached the maturity of backend technology, though it’s rapidly heading in that direction. Frameworks for productizing AI already exist, but they are increasingly being offered as services.

In my professional experience, I’ve often had to create scalable AI solutions, and these solutions are closely tied to backend systems and containerization.

RAG as a system is by now well enough understood to be delivered as ready-to-use containers in a working environment. Hence the idea of rag-a-muffin: configurable, open-source Docker containers designed for building RAG systems.

What it does

Currently, active development is underway on three Docker containers, with their functionality carefully selected based on real-world experience in building production RAG systems.

As of now, rag-a-muffin consists of:

  1. File Management Container – This container enables file uploads, maintains a log of all uploads, and manages file access permissions. Its purpose is to provide the administration tools for the knowledge base in a RAG system. At present, the container supports two cloud vendors: GCP and AWS. Users can either rely on the container’s preconfigured setup or create their own configuration (specifying where to upload files, which database to log actions to, and how to notify other services about file-related events). In the future, the container will likely evolve to handle all types of knowledge: not just text files, as it does now, but also databases, videos, etc.

  2. Indexing Container – This container is designed to convert knowledge into embeddings, but that’s far from its only function. Beyond creating embeddings, it allows for managing vector or graph knowledge, sharding, grouping vector and graph spaces, and providing various search capabilities. The container also comes with ready-to-use configurations, but custom configurations are supported as well. A configuration can literally include everything: the type of vector storage, graph storage, embedding model, data enrichment model, sharding principles, permissions, and much more.

  3. Chat Container – This container enables interaction with knowledge through chat. It’s a WebSocket-based chat implementation that connects to any knowledge base, saves chat history, allows users to specify which knowledge to interact with, separates responses by query type, and offers many other features. The container is also configurable, with both predefined setups and options for custom configurations.
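Since all three containers offer both predefined setups and custom configurations, here is a minimal sketch of what a custom configuration for the indexing container might look like, with defaults and a small validator. Every key name, allowed value, and default below is a hypothetical assumption for illustration; the project’s actual schema may differ.

```python
# Hypothetical configuration sketch for the indexing container.
# All keys, allowed values, and defaults are illustrative assumptions,
# not the project's real schema.

ALLOWED_VECTOR_STORES = {"pgvector", "qdrant", "weaviate"}
ALLOWED_EMBEDDING_MODELS = {"text-embedding-3-small", "all-MiniLM-L6-v2"}

DEFAULT_CONFIG = {
    "vector_store": {"type": "pgvector", "url": "postgresql://db:5432/rag"},
    "graph_store": None,  # optional graph knowledge store
    "embedding_model": "text-embedding-3-small",
    "sharding": {"strategy": "by_collection", "shards": 4},
    "permissions": {"default": "read"},
}

def validate_config(user_config: dict) -> dict:
    """Merge a user config over the defaults and reject unsupported values."""
    merged = {**DEFAULT_CONFIG, **user_config}
    store_type = merged["vector_store"]["type"]
    if store_type not in ALLOWED_VECTOR_STORES:
        raise ValueError(f"unsupported vector store: {store_type}")
    if merged["embedding_model"] not in ALLOWED_EMBEDDING_MODELS:
        raise ValueError(f"unsupported embedding model: {merged['embedding_model']}")
    if merged["sharding"]["shards"] < 1:
        raise ValueError("shard count must be positive")
    return merged

# Override only the pieces you care about; the rest falls back to defaults.
custom = validate_config({"embedding_model": "all-MiniLM-L6-v2"})
```

The merge-over-defaults pattern keeps the preconfigured setup usable out of the box while letting users override only the parts they care about.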

How we built it

We are gathering the material we’ve built across various projects that made it to production and compiling it into an open-source project. Much of the earlier work was done with LangChain, but we encountered several challenges with that tool, so for the open-source version we plan to use LlamaIndex instead. We are also considering contributing some of this work back to the LlamaIndex repository.

It’s also worth mentioning that we aim to release production-ready containers, which will be available on Docker Hub. These containers can be used both in simple Docker Compose setups and within Kubernetes clusters.
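As a sketch of what a simple Docker Compose setup might look like once the images are published, the wiring between the three containers could be roughly as follows. The image names, ports, and environment variables are hypothetical placeholders assumed for illustration; nothing here corresponds to published images yet.

```yaml
# Hypothetical docker-compose.yml; image names, ports, and variables
# are illustrative placeholders, not published artifacts.
services:
  files:
    image: ragamuffin/file-management   # hypothetical image name
    environment:
      STORAGE_BACKEND: aws              # or gcp
    ports:
      - "8001:8000"
  indexing:
    image: ragamuffin/indexing          # hypothetical image name
    environment:
      VECTOR_STORE: pgvector
    depends_on:
      - files
  chat:
    image: ragamuffin/chat              # hypothetical image name
    ports:
      - "8003:8000"                     # WebSocket endpoint
    depends_on:
      - indexing
```

In a Kubernetes cluster, each container would map naturally onto its own Deployment and Service, with the same configuration supplied through environment variables or a ConfigMap.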

Challenges we ran into

The main challenge is building a truly versatile tool. The task lies in creating systems where every part can be swapped out without altering the final pipeline, and that’s quite difficult. On top of that, there’s a certain weight of responsibility for the quality of the code, the reliability of the system, and the speed at which we can resolve issues. Right now, a small team is developing these containers, and they aren’t production-ready yet, but we’re hopeful that at the upcoming hackathon, we’ll find like-minded individuals to join the effort.

Accomplishments that we're proud of

It’s still too early to talk about any real achievements: the code in the repository is honestly a mess at this stage. We’ve only just started transferring our work over, beginning with this hackathon. Before the hackathon, we built RAG systems for production, but they were heavily tied to specific customers and their unique requirements. Now we’re working on creating levels of abstraction and tackling the tasks one step at a time.

At the moment, the most usable part is the file management service with an AWS configuration, and we’re working on a GCP configuration as well. The scope of the indexing service is slowly starting to take shape, though the service itself is still in a pretty rough state!

What we learned

Well, to be honest, it’s hard to say we’ve really learned anything new, but we are definitely sharpening our system design skills—that much is true. Besides honing the technical side, we’re also trying to build an open-source project, which brings its own set of challenges: working with the community and moderating pull requests (hopefully, we’ll have some!).

What's next for rag-a-muffin

The main goal is to get these three containers in good shape so that at least the basic tasks of RAG systems are covered and working reliably.

The code in the public GitHub repository is still in pretty rough shape. At the moment, most of the work is focused on two services, which is why only those two are in the repository (the file management and indexing services). Thanks!
