Inspiration
LLMs are powerful, but they forget things as conversations get longer. Costs go up, latency increases, and important decisions get buried in context.
We wanted to build something that gives AI structured memory instead of just passing around huge transcripts.
What it does
Lethus AI is a memory layer for LLM conversations. It keeps track of decisions, updates, issues, and resolutions in a structured way. Instead of sending the full conversation every time, it retrieves only what is relevant and combines it with a clean state document.
In our demo, we reduced token usage by around 60 to 80 percent while still answering recall questions correctly.
How we built it
We used Node.js, TypeScript, PostgreSQL, local embeddings, and OpenAI models. The system has two flows.
The hot path runs while the user waits. It calculates similarity, selects relevant spans of conversation, and sends a compressed prompt to the LLM.
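As a rough illustration of the similarity step, here is a minimal sketch that assumes cosine similarity over precomputed embeddings. The `StoredMessage` shape and the `topKTurns` helper are hypothetical names for this example, not taken from the Lethus codebase, and the real scoring may differ.

```typescript
// Hypothetical shape for a stored message; not taken from the Lethus codebase.
interface StoredMessage {
  turn: number;        // position in the conversation
  text: string;
  embedding: number[]; // assumed to be precomputed on the cold path
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the turn numbers of the k most similar messages, best match first.
function topKTurns(query: number[], messages: StoredMessage[], k: number): number[] {
  return messages
    .map((m) => ({ turn: m.turn, score: cosineSimilarity(query, m.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.turn);
}
```

The returned turn numbers would then feed whatever selection step decides which parts of the conversation go into the compressed prompt.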
The cold path runs after the response is returned. It stores embeddings, generates a changelog entry, and updates a structured state document.
This keeps latency low while maintaining long-term memory.
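The split between the two flows could look roughly like the sketch below. It assumes the cold path is simply deferred with `setTimeout` inside the same process; a real deployment might use a job queue instead. The `Deps` interface and its three functions are hypothetical stand-ins, injected so the example runs without a database or an API key.

```typescript
// Hypothetical dependencies; the names are illustrative, not Lethus APIs.
interface Deps {
  buildPrompt: (userMessage: string) => Promise<string>;               // hot path: retrieval and compression
  callModel: (prompt: string) => Promise<string>;                      // hot path: LLM call
  updateMemory: (userMessage: string, reply: string) => Promise<void>; // cold path: embeddings, changelog, state
}

async function handleTurn(userMessage: string, deps: Deps): Promise<string> {
  // Hot path: everything the user has to wait for.
  const prompt = await deps.buildPrompt(userMessage);
  const reply = await deps.callModel(prompt);

  // Cold path: scheduled but not awaited, so the reply is returned first.
  setTimeout(() => {
    deps.updateMemory(userMessage, reply).catch((err) => {
      // A failed memory update should not break the conversation.
      console.error("cold path failed", err);
    });
  }, 0);

  return reply;
}
```

Because `updateMemory` is never awaited inside `handleTurn`, its cost does not add to the response time the user sees.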
Challenges we ran into
Keeping the state document accurate over time was difficult. We had to handle overrides properly so old decisions did not stay active.
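One way to picture override handling is a state document keyed by topic, where a newer entry replaces the active value and the old value is kept only as history. This is a sketch under that assumption; the `ChangelogEntry`, `StateItem`, and `applyEntry` names are hypothetical and the actual Lethus schema is not described here.

```typescript
// Hypothetical changelog entry; "key" is an assumed topic identifier such as "database".
interface ChangelogEntry {
  turn: number;
  key: string;
  value: string;
}

interface StateItem {
  value: string;        // the currently active decision
  sinceTurn: number;    // turn at which it became active
  superseded: string[]; // earlier values, kept for audit but no longer active
}

type StateDocument = Record<string, StateItem>;

// Apply one entry without mutating the previous state.
function applyEntry(state: StateDocument, entry: ChangelogEntry): StateDocument {
  const current: StateItem | undefined = state[entry.key];

  // Ignore entries older than the active one, and repeats of the active value.
  if (current !== undefined) {
    if (entry.turn < current.sinceTurn || entry.value === current.value) {
      return state;
    }
  }

  const superseded = current === undefined
    ? []
    : [...current.superseded, current.value];

  return {
    ...state,
    [entry.key]: { value: entry.value, sinceTurn: entry.turn, superseded },
  };
}
```

With this shape, only `value` is ever presented as the active decision, so an overridden choice cannot leak back into the prompt as if it were still current.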
Retrieval was another challenge. Retrieving isolated similar messages broke reasoning chains, so we built a span-based selection approach instead.
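The write-up does not spell out how spans are chosen, so the following is only one plausible reading: widen each matching turn into a window of neighboring turns, then merge windows that overlap or touch. The `Span` type, the `selectSpans` function, and the `radius` parameter are assumptions made for this sketch.

```typescript
// Inclusive range of turn indices; a hypothetical type for this sketch.
interface Span {
  start: number;
  end: number;
}

// Expand each hit by `radius` turns on both sides, clamp to the conversation,
// and merge windows that overlap or are directly adjacent.
function selectSpans(hitTurns: number[], radius: number, totalTurns: number): Span[] {
  const windows: Span[] = hitTurns
    .map((t) => ({
      start: Math.max(0, t - radius),
      end: Math.min(totalTurns - 1, t + radius),
    }))
    .sort((a, b) => a.start - b.start);

  const merged: Span[] = [];
  for (const w of windows) {
    const last = merged[merged.length - 1];
    if (last !== undefined && w.start <= last.end + 1) {
      // Overlapping or adjacent: extend the previous span.
      last.end = Math.max(last.end, w.end);
    } else {
      merged.push({ start: w.start, end: w.end });
    }
  }
  return merged;
}
```

For example, hits at turns 3, 5, and 12 with a radius of 1 in a 20-turn conversation produce two spans, turns 2 to 6 and turns 11 to 13, so a question and the answer that follows it stay together.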
Accomplishments that we are proud of
We built a working memory system that:
- Cuts token usage by up to 80 percent
- Correctly remembers updated decisions
- Maintains structured state across more than 20 turns
- Runs with minimal added latency
We are especially proud that it feels like infrastructure, not just a wrapper around an API.
What we learned
Long conversations are not just about bigger context windows. They need better memory structure. Combining deterministic algorithms with small LLM updates works better than relying only on embeddings or summarization.
What is next for Lethus
We want to turn Lethus into a production-ready SDK with a dashboard for memory visibility, cross-session memory, and better support for long-running AI agents.
Our long-term vision is to make Lethus the memory layer for reliable AI systems.
Built With
- docker
- express.js
- milvus
- nextjs
- openai
- postgresql