Memory Layer


MemoryLayer: A Skeleton Framework for AI Applications with Persistent Memory

What Inspired Me

I kept hitting context window limits and losing important history whenever I started a new conversation. I wanted infrastructure that could extract and preserve key context, such as decisions, entities, and facts, so I could maintain continuity across interactions. What started as a fix for my own problem became MemoryLayer, a reusable skeleton that can power any AI app needing persistent memory.

Inspired by the MAKER framework (Meyerson et al., 2025), which solved million-step tasks with zero errors through multi-agent consensus, I implemented a MAKER-style reliability layer for memory extraction. Using parallel micro agents with voting consensus, MemoryLayer catches hallucinations and inconsistent extractions that single LLM calls miss. This makes extraction more robust and production-ready, whether you are building conversation managers, voice assistants, or entirely new applications.

What It Does

MemoryLayer is a modular memory infrastructure that lets AI applications remember and reason over past interactions. The framework consists of four focused packages that together cover four jobs: capturing conversations, extracting structured knowledge, storing it in SQL with vector search, and building intelligent context for LLM calls.

For Kiroween’s Skeleton Crew category, MemoryLayer is the shared skeleton. Two production applications demonstrate its versatility:

Ghost: A voice-controlled AI assistant for macOS. It offers screen awareness through local OCR, voice-first interaction with an optional global hotkey, native Apple Reminders integration, an interactive D3.js knowledge graph visualization, and a real-time dashboard with SSE streaming.
Handoff: A web-based tool that transforms AI conversation exports into structured, reusable memories. It offers ToS-compliant import, LLM-powered extraction, context block generation, and semantic search, so that memories are ready to carry into new conversations.

Both applications share no application code but run on the identical MemoryLayer foundation, proving genuine reusability across very different use cases.

What I Learned

Architecture Decisions Compound Over Time

By designing MemoryLayer as four independent packages, I could swap SQLite for Postgres without touching application code, replace OpenAI embeddings with local Transformers.js, add the MAKER reliability layer without breaking existing apps, and test each package independently. Building memory systems directly into each app would have resulted in duplicated effort and divergent implementations.

Local-First Requires Solving Harder Problems

Making Ghost work completely offline required running embeddings locally, managing SQLite transactions across concurrent operations, handling macOS Vision framework permissions, and coordinating three processes without network dependencies. This constraint forced better architecture decisions that made the cloud version easier to build.

Multi-Agent Consensus Catches Errors Single Calls Miss

Implementing the MAKER reliability layer taught me that single LLM calls are fragile in production. Three parallel calls with voting consensus catch hallucinated entities, malformed JSON, and inconsistent interpretations. Using Gemini 2.0 Flash-Lite keeps the cost under one cent per extraction while providing production-grade reliability.
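A minimal sketch of the voting step described above, under stated assumptions: the `ExtractedEntity` shape, the `MicroAgent` type, and the function names are hypothetical and are not MemoryLayer's actual API. Each agent returns raw JSON text, malformed output counts as an empty ballot, and an entity survives only if enough agents agree on it.

```typescript
// Hypothetical shape of one extracted item; not MemoryLayer's real types.
interface ExtractedEntity {
  type: string;
  name: string;
}

// Hypothetical micro agent: takes conversation text, returns raw JSON text from an LLM.
type MicroAgent = (conversation: string) => Promise<string>;

// Normalize so that trivially different spellings vote together.
function entityKey(e: ExtractedEntity): string {
  return `${e.type.trim().toLowerCase()}:${e.name.trim().toLowerCase()}`;
}

// Parse one agent's output; malformed JSON or a wrong shape yields an empty ballot.
function parseBallot(raw: string): ExtractedEntity[] {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (!Array.isArray(parsed)) return [];
    return parsed.filter(
      (e: unknown): e is ExtractedEntity =>
        typeof e === "object" &&
        e !== null &&
        typeof (e as ExtractedEntity).type === "string" &&
        typeof (e as ExtractedEntity).name === "string",
    );
  } catch {
    return [];
  }
}

// Keep only entities that appear in at least `minVotes` ballots.
function tallyVotes(rawOutputs: string[], minVotes: number): ExtractedEntity[] {
  const votes = new Map<string, { entity: ExtractedEntity; count: number }>();
  for (const raw of rawOutputs) {
    const seen = new Set<string>(); // each agent votes at most once per entity
    for (const entity of parseBallot(raw)) {
      const key = entityKey(entity);
      if (seen.has(key)) continue;
      seen.add(key);
      const entry = votes.get(key);
      if (entry) entry.count += 1;
      else votes.set(key, { entity, count: 1 });
    }
  }
  return Array.from(votes.values())
    .filter((v) => v.count >= minVotes)
    .map((v) => v.entity);
}

// Run all agents in parallel; a failed call casts an empty ballot instead of aborting.
async function extractWithConsensus(
  conversation: string,
  agents: MicroAgent[],
): Promise<ExtractedEntity[]> {
  const raw = await Promise.all(
    agents.map((agent) => agent(conversation).catch(() => "")),
  );
  const majority = Math.floor(agents.length / 2) + 1;
  return tallyVotes(raw, majority);
}
```

With three agents the majority threshold is two, so an entity hallucinated by a single call is dropped, and one call returning broken JSON does not block the other two.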

How I Built It

Spec-Driven Development with Kiro

Every component was built using Kiro’s spec-driven methodology. The .kiro/specs/ directory contains comprehensive specifications for all four core packages and both apps, with requirements, design documents, and task breakdowns.

The development process followed three phases. First, I wrote detailed specifications defining system architecture, API contracts, and integration patterns. Second, using Kiro, I generated approximately 80 percent of the initial implementation, including TypeScript interfaces, React components, and API routes. Third, I completed the remaining 20 percent, which consisted of Swift native services, the D3.js visualization, and 155 tests in total.

Key Architectural Decisions

Four Independent Packages: MemoryLayer is split into storage for database operations, memory-extraction for LLM-powered extraction, context-engine for semantic search, and an optional core wrapper. Each package can be used independently or swapped for a custom implementation.
Database Abstraction Layer: The storage package works identically on SQLite for local development and Postgres for production. Applications switch databases by changing one configuration parameter.
Multi-Process Architecture: Ghost uses three processes: a daemon for OS integration built with Electron, a backend for MemoryLayer coordination built with Node.js and Hono, and a dashboard for visualization built with React. Communication happens via HTTP, with Server-Sent Events for real-time updates.
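To illustrate the database abstraction idea, here is a rough sketch of how one configuration parameter could select the SQL dialect while application code stays the same. Every name in it (`StorageConfig`, `Execute`, `createStore`, the `memories` table) is an assumption made for illustration and not the storage package's real API. The one dialect difference shown is a known one: SQLite binds positional parameters with `?`, while Postgres uses `$1`, `$2`, and so on.

```typescript
// All names here are hypothetical; this is not the storage package's real API.
type Driver = "sqlite" | "postgres";

interface StorageConfig {
  driver: Driver; // the one parameter an application changes
}

// Injected function that actually talks to the database,
// for example a thin wrapper around whichever driver library is in use.
type Execute = (sql: string, params: unknown[]) => Promise<unknown[]>;

interface MemoryStore {
  save(id: string, content: string): Promise<void>;
  findById(id: string): Promise<unknown | undefined>;
}

// SQLite uses "?" placeholders; Postgres uses numbered "$1", "$2", ...
function placeholder(driver: Driver, index: number): string {
  return driver === "postgres" ? `$${index}` : "?";
}

function createStore(config: StorageConfig, execute: Execute): MemoryStore {
  const p = (i: number): string => placeholder(config.driver, i);
  return {
    async save(id: string, content: string): Promise<void> {
      await execute(
        `INSERT INTO memories (id, content) VALUES (${p(1)}, ${p(2)})`,
        [id, content],
      );
    },
    async findById(id: string): Promise<unknown | undefined> {
      const rows = await execute(
        `SELECT id, content FROM memories WHERE id = ${p(1)}`,
        [id],
      );
      return rows[0];
    },
  };
}
```

Application code only ever sees `MemoryStore`, so moving from local development to production in this sketch means passing `{ driver: "postgres" }` and a different `execute` function.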

Challenges Faced

Token Budget Management

Building context for LLM calls requires balancing maximum token budgets, relevance thresholds, and recency bias. The context engine scores each memory using weighted factors, then packs memories by score until the token budget is exhausted.
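The scoring and packing step can be sketched as follows. This is an illustration only: the field names, the weights, and the exponential recency decay are assumptions, not the context engine's actual formula.

```typescript
// Hypothetical memory shape; the real context engine's types may differ.
interface CandidateMemory {
  id: string;
  relevance: number; // similarity to the current query, in [0, 1]
  ageInDays: number; // time since the memory was created
  tokens: number;    // precomputed token count of the memory text
}

// Hypothetical tuning knobs for the three factors named in the text.
interface PackOptions {
  maxTokens: number;       // total token budget for the context block
  minRelevance: number;    // memories below this threshold are never included
  relevanceWeight: number;
  recencyWeight: number;
  halfLifeDays: number;    // recency halves every `halfLifeDays` (an assumed decay)
}

// Weighted score: relevance plus an exponentially decaying recency bonus.
function scoreMemory(m: CandidateMemory, o: PackOptions): number {
  const recency = Math.pow(0.5, m.ageInDays / o.halfLifeDays);
  return o.relevanceWeight * m.relevance + o.recencyWeight * recency;
}

// Greedy packing: highest score first, skipping anything that no longer fits.
function packContext(memories: CandidateMemory[], o: PackOptions): CandidateMemory[] {
  const ranked = memories
    .filter((m) => m.relevance >= o.minRelevance)
    .map((m) => ({ memory: m, score: scoreMemory(m, o) }))
    .sort((a, b) => b.score - a.score);

  const packed: CandidateMemory[] = [];
  let used = 0;
  for (const { memory } of ranked) {
    if (used + memory.tokens > o.maxTokens) continue; // too big, try smaller ones
    packed.push(memory);
    used += memory.tokens;
  }
  return packed;
}
```

This sketch keeps scanning after a memory fails to fit, so smaller lower-ranked memories can use the leftover budget. Stopping at the first memory that does not fit is the stricter alternative and preserves pure score order.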

Electron and Swift Coordination

Ghost’s daemon calls Swift scripts from Electron’s sandboxed environment. Swift code compiles to signed executables during build and executes via Node.js child_process.spawn. macOS permissions for Screen Recording, Accessibility, and Calendars must be explicitly granted and coordinated with these processes.

Conversation Chunking

Large conversations exceed LLM context windows and must be chunked without losing context. The solution uses tiktoken for token-accurate counting, configurable overlap to preserve context across chunk boundaries, parallel extraction, and deduplication using deterministic IDs.
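A simplified sketch of overlapping chunks and deterministic IDs, with two substitutions that should be read as assumptions: a whitespace tokenizer stands in for tiktoken, and a small FNV-1a style hash stands in for whatever ID scheme the project actually uses (a cryptographic hash such as SHA-256 would be a more typical choice). All function names are hypothetical.

```typescript
// Stand-in tokenizer: splits on whitespace. The project uses tiktoken for real counts.
function tokenize(text: string): string[] {
  return text.split(/\s+/).filter((t) => t.length > 0);
}

// Slide a window of `maxTokens` over the tokens, repeating the last
// `overlap` tokens of each chunk at the start of the next one.
function chunkTokens(tokens: string[], maxTokens: number, overlap: number): string[][] {
  if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
    throw new Error("require maxTokens > 0 and 0 <= overlap < maxTokens");
  }
  const chunks: string[][] = [];
  const step = maxTokens - overlap;
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + maxTokens));
    if (start + maxTokens >= tokens.length) break; // last window reached the end
  }
  return chunks;
}

// Deterministic ID: the same normalized (type, content) pair always maps to the same ID.
// FNV-1a style 32-bit hash over UTF-16 code units, used here only for illustration.
function deterministicId(type: string, content: string): string {
  const input = `${type.trim().toLowerCase()}|${content.trim().toLowerCase()}`;
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Keep the first item for each ID, so memories extracted twice
// from overlapping chunks collapse into one.
function dedupeById<T extends { type: string; content: string }>(items: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const item of items) {
    const id = deterministicId(item.type, item.content);
    if (seen.has(id)) continue;
    seen.add(id);
    unique.push(item);
  }
  return unique;
}
```

Because each chunk can be extracted in parallel and the overlap means the same fact may be seen twice, deriving the ID from content rather than from position lets the merge step drop duplicates without coordination between workers.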

Database Reliability

I ran into a wide range of database issues, including schema design, migrations, connection limits, vector index configuration, and concurrent writes. Working through them hardened the storage layer so both SQLite and Postgres behave reliably under real workloads.

Built With

electron
gemini
hono
kiro
sqlite
supabase
typescript

Updates

Leslie Osei-Anane started this project — Dec 05, 2025 04:26 PM EST
