🌌 MemoraAI — The Memory Layer for Intelligent Collaboration
“I developed MemoraAI to fix the one thing that limits every AI: response latency that grows with conversation size.”
💡 Problem Statement
While working on long-term projects using tools like ChatGPT and other AI assistants, I noticed something frustrating — as the conversation grows, the performance and response speed start to drop.
The model becomes slower, contextually confused, and eventually, even branching to a new chat doesn’t help because the entire historical context is lost.
It felt like every time I made progress, I had to start over. Refeeding the same data into a new session was time-consuming and inefficient — especially for developers, researchers, and creators who depend on AI continuity.
That’s when I asked myself:
“What if AI could remember — just like we do?”
So I built MemoraAI — a system that lets AI retain knowledge, recall relevant context instantly, and stay consistent across sessions.
🚀 The Vision
MemoraAI is designed to preserve and retrieve memory context dynamically for large language models.
Instead of sending the entire chat history each time, MemoraAI uses a semantic memory architecture — enabling faster, cheaper, and smarter continuity for AI-driven workflows.
It’s not a chatbot. It’s a memory infrastructure for intelligent systems.
⚙️ How I Built It
MemoraAI combines Vertex AI, Elasticsearch, and FastAPI to simulate human-like long-term memory for AI models.
🧩 Architecture Overview
User → Streamlit Interface → FastAPI API → Vertex AI + Elasticsearch
- Ingestion: Every interaction is split into chunks, embedded using Google’s textembedding-gecko@001, and stored as a dense vector in Elasticsearch.
- Retrieval: On every new query, MemoraAI performs a hybrid search, combining lexical match and semantic similarity.
- MMR Reranking: I implemented Maximal Marginal Relevance (MMR) to select the most relevant and diverse memories.
- Generation: The selected context is merged into a system prompt and passed to Vertex Gemini for reasoning and response generation.
- Persistence: Even after sessions end, memory remains — ensuring project continuity across time.
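The ingestion and retrieval steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (`chunk_text`, `lexical_overlap`, `hybrid_score`), the word-window sizes, and the `alpha` blending weight are all hypothetical, toy vectors stand in for the Vertex AI embeddings, and a simple term-overlap ratio stands in for Elasticsearch's lexical scoring.

```python
import math
import re


def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_overlap(query, text):
    """Fraction of query terms found in the text (a crude lexical signal)."""
    q_terms = set(re.findall(r"\w+", query.lower()))
    t_terms = set(re.findall(r"\w+", text.lower()))
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0


def hybrid_score(query, query_vec, text, text_vec, alpha=0.5):
    """Blend semantic and lexical signals; alpha weights the semantic part."""
    semantic = cosine(query_vec, text_vec)
    lexical = lexical_overlap(query, text)
    return alpha * semantic + (1 - alpha) * lexical
```

In a real deployment the blending would more likely happen inside the search engine itself; the point here is only that each stored chunk gets one score that reflects both kinds of match.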
🧮 The Math Behind MemoraAI
The MMR (Maximal Marginal Relevance) approach ensures context is both relevant and diverse:
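\[ \text{MMR} = \arg\max_{D_i \in C \setminus S} \left[ \lambda \cdot \text{sim}(Q, D_i) - (1 - \lambda) \cdot \max_{D_j \in S} \text{sim}(D_i, D_j) \right] \]
where:
- \( Q \): Query vector
- \( C \): Set of candidate memories returned by retrieval
- \( D_i \): Candidate document not yet selected
- \( S \): Already selected context
- \( \lambda \): Relevance-diversity tradeoff, between 0 and 1 (higher values favor relevance)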
This ensures MemoraAI recalls the right context at the right time — just like selective human memory.
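A minimal sketch of the greedy selection that this formula describes, assuming cosine similarity for `sim` and plain Python lists as vectors. The function name `mmr_select` and the defaults for `k` and `lam` are illustrative choices, not values taken from MemoraAI.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidates, k=3, lam=0.7):
    """Greedily pick up to k candidate indices by Maximal Marginal Relevance."""
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query_vec, candidates[i])
        # Redundancy is the highest similarity to anything already chosen;
        # it is 0 while nothing has been selected yet.
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Two near-duplicate memories and one distinct memory (toy 2-D vectors).
memories = [[1.0, 0.9], [1.0, 0.8], [0.0, 1.0]]
print(mmr_select([1.0, 1.0], memories, k=2, lam=0.5))  # [0, 2]
print(mmr_select([1.0, 1.0], memories, k=2, lam=1.0))  # [0, 1]
```

With `lam=0.5` the second pick skips the near-duplicate in favor of the distinct memory, while `lam=1.0` reduces to plain relevance ranking.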
🔬 What I Learned
Building MemoraAI taught me the art of balancing intelligence and performance.
Some lessons include:
- Memory retrieval matters more than storage — it’s about understanding context relevance.
- Vertex AI permissions and IAM roles are powerful but require careful setup for secure access.
- Hybrid search (semantic + lexical) dramatically improves recall quality.
- Prompt design can transform how an AI uses memory.
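To make the prompt-design point concrete, here is one hedged way that selected memories could be merged into a system prompt. The function name `build_system_prompt`, the default instruction text, and the numbered layout are assumptions for illustration; the actual MemoraAI prompt is not shown in this write-up.

```python
def build_system_prompt(memories, instructions="You are a helpful assistant."):
    """Merge recalled memories into a system prompt (hypothetical layout)."""
    if not memories:
        return instructions
    # Number each memory so the model can tell separate recollections apart.
    lines = [f"[{i}] {m.strip()}" for i, m in enumerate(memories, start=1)]
    return (
        f"{instructions}\n\n"
        "Relevant memories from earlier sessions (use only if helpful):\n"
        + "\n".join(lines)
    )
```

Keeping memories clearly separated from the instructions makes it easier for the model to treat them as background context and not as new commands.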
⚔️ Challenges I Faced
- Vertex AI Access: Configuring the right service accounts and model permissions was complex.
- Elasticsearch Vector Mapping: The vector field had to be mapped with 768 dimensions and cosine similarity to match the embeddings.
- Cloud Run Deployment: Containers failed until I learned that Cloud Run requires the server to listen on the port given by the $PORT environment variable (8080 by default).
- Context Limitations: I had to design token-aware logic to trim memory dynamically while preserving meaning.
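Three of these challenges can be illustrated with a short sketch. Everything named here is an assumption: the field names `text` and `embedding`, the helper names `get_port`, `estimate_tokens`, and `trim_to_budget`, and above all the word-count token estimate, which only roughly approximates a real tokenizer.

```python
import os

# Hypothetical index mapping; the field names are assumptions.
# dims=768 matches the embedding size mentioned above.
MEMORY_INDEX_MAPPING = {
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
            },
        }
    }
}


def get_port(default=8080):
    """Read the port Cloud Run injects via PORT, falling back to a default."""
    return int(os.environ.get("PORT", default))


def estimate_tokens(text):
    """Rough estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def trim_to_budget(ranked_memories, max_tokens):
    """Keep memories in rank order while they fit within the token budget."""
    kept, used = [], 0
    for memory in ranked_memories:
        cost = estimate_tokens(memory)
        if used + cost > max_tokens:
            continue  # too large; a smaller, lower-ranked memory may still fit
        kept.append(memory)
        used += cost
    return kept
```

The server would then be started on `0.0.0.0` with `get_port()` as the port, for example through `uvicorn.run(app, host="0.0.0.0", port=get_port())` for a FastAPI app.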
🧠 Key Innovation
Unlike traditional AI memory systems that store entire histories, MemoraAI selectively remembers.
It filters, ranks, and compresses context so that the AI only recalls what’s useful, which keeps prompts small, lowers latency, and keeps responses consistent across conversations.
This makes it ideal for:
- Research assistants
- AI development copilots
- Journaling and therapy tools
- Persistent multi-session AI projects
🌍 Technology Stack
| Component | Technology |
|---|---|
| Frontend | Streamlit |
| Backend | FastAPI |
| Memory Store | Elasticsearch |
| Embeddings | Vertex AI textembedding-gecko@001 |
| Reasoning | Vertex Gemini 1.5 Flash |
| Hosting | Google Cloud Run |
💭 Reflection
MemoraAI began as a frustration but evolved into a mission — to make AI more human.
I realized that memory is not just about storing data; it’s about continuity, understanding, and identity.
With MemoraAI, I built a foundation for AI that doesn’t forget its purpose — or yours.
✨ Final Thought
“The best way to predict the future of AI is to give it a past.”
MemoraAI does exactly that — it remembers so your progress never resets.
🏆 MemoraAI — Turning Conversations Into Memory
Built With
- c
- c++
- cython
- elasticsearch
- fastapi
- fortran
- javascript
- python
- streamlit
- vertex
- vertexgemini