[Screenshot: MemoraAI UI]
[Screenshot: ChatGPT when a project conversation gets too big]

🌌 MemoraAI — The Memory Layer for Intelligent Collaboration

“I developed MemoraAI to fix the one thing that limits every AI: response latency that grows with conversation size.”

💡 Problem Statement

While working on long-term projects with ChatGPT and other AI assistants, I noticed something frustrating: as a conversation grows, response speed and quality start to drop.
The model becomes slower and loses track of context, and eventually even branching to a new chat doesn’t help, because the entire historical context is lost.

It felt like every time I made progress, I had to start over. Re-feeding the same data into a new session was time-consuming and inefficient, especially for developers, researchers, and creators who depend on AI continuity.

That’s when I asked myself:

“What if AI could remember — just like we do?”

So I built MemoraAI — a system that lets AI retain knowledge, recall relevant context instantly, and stay consistent across sessions.

🚀 The Vision

MemoraAI is designed to preserve and retrieve memory context dynamically for large language models.
Instead of sending the entire chat history each time, MemoraAI uses a semantic memory architecture — enabling faster, cheaper, and smarter continuity for AI-driven workflows.

It’s not a chatbot. It’s a memory infrastructure for intelligent systems.

⚙️ How I Built It

MemoraAI combines Vertex AI, Elasticsearch, and FastAPI to simulate human-like long-term memory for AI models.

🧩 Architecture Overview

User → Streamlit Interface → FastAPI API → Vertex AI + Elasticsearch

Ingestion: Every interaction is split into chunks, embedded using Google’s textembedding-gecko@001, and stored as a dense vector in Elasticsearch.
Retrieval: On every new query, MemoraAI performs a hybrid search — combining lexical match and semantic similarity.
MMR Reranking: I implemented Maximal Marginal Relevance (MMR) to select the most relevant and diverse memories.
Generation: The selected context is merged into a system prompt and passed to Vertex Gemini for reasoning and response generation.
Persistence: Even after sessions end, memory remains — ensuring project continuity across time.
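The ingestion and retrieval steps above can be sketched as an index mapping plus a hybrid search request. This is a minimal illustration, not the project's actual code: it assumes Elasticsearch 8.x, where a top-level `knn` clause can be combined with a lexical `query` in one search request. The field names `text` and `embedding`, the helper names, and the `num_candidates` heuristic are all hypothetical.

```python
from typing import Any

EMBEDDING_DIMS = 768  # embedding size mentioned in this write-up


def memory_index_mapping() -> dict[str, Any]:
    """Index mapping with a lexical field and a dense vector field.

    Field names "text" and "embedding" are assumptions for illustration.
    """
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": EMBEDDING_DIMS,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def hybrid_search_body(
    query_text: str, query_vector: list[float], k: int = 20
) -> dict[str, Any]:
    """Build a search body that mixes lexical match with kNN similarity."""
    if len(query_vector) != EMBEDDING_DIMS:
        raise ValueError(
            f"expected {EMBEDDING_DIMS} dimensions, got {len(query_vector)}"
        )
    return {
        # Lexical side: standard full-text match on the chunk text.
        "query": {"match": {"text": query_text}},
        # Semantic side: approximate nearest neighbours on the embedding.
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            # Heuristic (assumption): examine more candidates than we return.
            "num_candidates": max(k * 5, 100),
        },
        "size": k,
    }
```

Both helpers only build plain dictionaries, so they can be passed to whichever Elasticsearch client the backend uses. The candidates returned by the search would then go to the MMR step.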

🧮 The Math Behind MemoraAI

The MMR (Maximal Marginal Relevance) approach ensures context is both relevant and diverse:

\[ \text{MMR} = \arg\max_{D_i \in C} \left[ \lambda \cdot \text{sim}(Q, D_i) - (1 - \lambda) \cdot \max_{D_j \in S} \text{sim}(D_i, D_j) \right] \]

where:

\( Q \): Query vector
\( D_i \): Candidate document
\( C \): Candidate set (retrieved memories not yet selected)
\( S \): Already selected context
\( \lambda \): Relevance-diversity tradeoff, between 0 and 1

This ensures MemoraAI recalls the right context at the right time — just like selective human memory.
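The selection rule can be sketched as a greedy loop: at each step, pick the candidate with the highest MMR score against the memories already chosen. This is a minimal pure-Python illustration under assumptions, not the project's implementation: cosine similarity stands in for sim, and the function names and the default `lam=0.7` are hypothetical.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(
    query: Vector, candidates: Sequence[Vector], k: int, lam: float = 0.7
) -> list[int]:
    """Greedily pick up to k candidate indices by Maximal Marginal Relevance."""
    remaining = list(range(len(candidates)))
    selected: list[int] = []

    def score(i: int) -> float:
        relevance = cosine(query, candidates[i])
        # Redundancy is the closest match among already selected items
        # (0.0 while nothing has been selected yet).
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, with the query `[1.0, 0.0]` and candidates `[[1.0, 0.1], [1.0, 0.11], [0.7, -0.7]]`, `lam=1.0` ranks purely by relevance and returns `[0, 1]`, while `lam=0.5` returns `[0, 2]` because the second candidate is nearly a duplicate of the first.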

🔬 What I Learned

Building MemoraAI taught me the art of balancing intelligence and performance.
Some lessons include:

Memory retrieval matters more than storage — it’s about understanding context relevance.
Vertex AI permissions and IAM roles are powerful but require careful setup for secure access.
Hybrid search (semantic + lexical) dramatically improves recall quality.
Prompt design can transform how an AI uses memory.

⚔️ Challenges I Faced

Vertex AI Access: Configuring the right service accounts and model permissions was complex.
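Elastic Vector Mapping: The Elasticsearch dense_vector field had to be declared with 768 dimensions and cosine similarity to match the embeddings.
Cloud Run Deployment: Containers failed to start until I learned that the server must listen on the port Cloud Run passes in the PORT environment variable (8080 by default).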
Context Limitations: I had to design token-aware logic to trim memory dynamically while preserving meaning.
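The token-aware trimming mentioned in the last item could look roughly like the sketch below. It is an assumption-heavy illustration, not the project's logic: it assumes the memories arrive already ranked best first, keeps whole chunks rather than cutting text mid-sentence, and approximates token counts by whitespace-separated words. A real tokenizer would give different counts. The names `approx_tokens` and `trim_to_budget` are hypothetical.

```python
from typing import Callable, Iterable


def approx_tokens(text: str) -> int:
    """Rough token estimate: one token per whitespace-separated word.

    This is an assumption for illustration; swap in a real tokenizer
    for accurate budgeting.
    """
    return len(text.split())


def trim_to_budget(
    memories: Iterable[str],
    budget: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> list[str]:
    """Keep ranked memories, in order, while they fit the token budget."""
    kept: list[str] = []
    used = 0
    for memory in memories:  # assumed ordered from most to least useful
        cost = count_tokens(memory)
        if used + cost > budget:
            # Skip chunks that do not fit; a shorter one later may still fit.
            continue
        kept.append(memory)
        used += cost
    return kept
```

Keeping or dropping whole chunks is one simple way to avoid truncating a memory in the middle of a thought; summarizing oversized chunks would be an alternative design.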

🧠 Key Innovation

Unlike traditional AI memory systems that store entire histories, MemoraAI selectively remembers.
It filters, ranks, and compresses context so that the AI only recalls what’s truly useful, which keeps prompts small, latency low, and answers consistent across conversations.

This makes it ideal for:

Research assistants
AI development copilots
Journaling and therapy tools
Persistent multi-session AI projects

🌍 Technology Stack

| Component | Technology |
| --- | --- |
| Frontend | Streamlit |
| Backend | FastAPI |
| Memory Store | Elasticsearch |
| Embeddings | Vertex AI `textembedding-gecko@001` |
| Reasoning | Vertex Gemini 1.5 Flash |
| Hosting | Google Cloud Run |

💭 Reflection

MemoraAI began as a frustration but evolved into a mission — to make AI more human.
I realized that memory is not just about storing data; it’s about continuity, understanding, and identity.
With MemoraAI, I built a foundation for AI that doesn’t forget its purpose — or yours.

✨ Final Thought

“The best way to predict the future of AI is to give it a past.”

MemoraAI does exactly that — it remembers so your progress never resets.

🏆 MemoraAI — Turning Conversations Into Memory

Built With

c
c++
cython
elasticsearch
fastapi
fortran
javascript
python
streamlit
vertex
vertexgemini

Updates

Evans Kaila started this project — Oct 23, 2025 06:32 AM EDT
