The Phoenix Protocol: A Story of User Memory in the Agentic Economy

What Inspired Me

I was frustrated. Every time I opened a new chat with ChatGPT, Claude, or Gemini, I found myself repeating the same information: "I'm on a budget," "I prefer direct communication," "I'm vegan now." The AI race was moving so fast, one month GPT-4 was better, the next month Claude 3.5 was leading, but none of these models remembered who I was across conversations.

The problem wasn't just about convenience. In the emerging agentic economy, where AI agents handle complex, multi-step tasks, there's a fundamental asymmetry: everyone is focused on agentic memory (how agents remember their own reasoning), but almost no one is focused on user memory (how agents remember the user's preferences, constraints, and context).

This insight became the core of Phoenix Protocol: In a world where users don't care which agent they use as long as it works best, the agent that remembers the user wins.

What I Learned

1. Graph Databases Are Perfect for Conflict Resolution

I learned that storing user preferences and constraints as a knowledge graph (using Neo4j) enables something vector databases can't: relationship-aware conflict resolution.

When a user says "I like Steakhouse X" but also has a constraint "No red meat (hard)," a vector search might return both as relevant. But a graph can traverse the CONFLICTS_WITH relationship and apply priority rules:

(:User)-[:PREFERS]->(:Restaurant {name: "Steakhouse X"})
(:User)-[:HAS_CONSTRAINT {priority: "hard"}]->(:Diet {rule: "No red meat"})
(:Restaurant)-[:CONFLICTS_WITH]->(:Diet)

The graph query can compute: "Hard constraint overrides preference → Recommend alternative restaurant."

2. Local-First Architecture Matters for Privacy

I built the Context Wallet as an encrypted SQLite database that lives on the user's machine. This taught me that local-first doesn't just mean "works offline", it means the user owns their data. The extension calls a local router (localhost:8787) that never sends raw conversations to external services. Only extracted semantic tuples (structured relationships) go to Neo4j, and even that can be self-hosted.

3. Non-Invasive UX Wins

Early prototypes tried to automatically inject context on every message. Users found it jarring. I pivoted to an "Enhance" button (🔥) that users click when they want context added. This gave users control while still making the magic obvious.

The extension detects ChatGPT, Claude, and Gemini using site-specific DOM selectors, then injects context by prepending formatted text:

const enhancedPrompt = `${contextPack.pack_text}\n\n${originalPrompt}`;

4. MemVerge Enables True Agent Immortality

For the agent demo, I integrated MemVerge Memory Machine Cloud to checkpoint agent RAM every 30 seconds. This taught me that state persistence isn't just about saving data, it's about freezing the entire execution state. When an agent crashes at step 25 of 50, MemVerge restores it to the exact step with all intermediate calculations intact.

The math is simple: if an agent task takes $T$ seconds and checkpoints every $C$ seconds, the worst-case recovery time is $C$ seconds, not $T$ seconds.

$$\text{Recovery Time} = \min(C, T_{\text{elapsed}})$$

5. Profile Systems Need Semantic Understanding

I built a profile system with 12 baseline questions (communication style, learning preferences, etc.) that extract into semantic tuples. I learned that semantic tags (like ["communication", "directness", "style"]) enable better matching than keyword search. When a user asks about restaurants, the system can match not just "food" preferences but also "budget" constraints and "diet" rules through semantic relationships.

How I Built It

Phase 1: Foundation (3-4 hours)

I started with a FastAPI router on localhost:8787 that would serve as the "Context Router", the service that generates Context Packs from Memory Cards. I built:

Context Wallet: Encrypted SQLite store for Memory Cards (local-first)
Chrome Extension skeleton: Content scripts that detect chat interfaces
Memory Card models: Pydantic models for constraints, preferences, goals, capabilities

The key insight: keep it simple. The wallet stores atomic "cards" (e.g., "No red meat", "Budget $50/month") that can be combined into context packs.

Phase 2: Neo4j Graph Integration (3-4 hours)

This was the "soul" of Phoenix. I integrated Neo4j AuraDB to store Memory Cards as a knowledge graph:

Nodes: User, Preference, Constraint, Goal, Entity
Relationships: PREFERS, HAS_CONSTRAINT, CONFLICTS_WITH, OVERRIDES

I wrote Cypher queries for conflict resolution:

MATCH (u:User)-[pref:PREFERS]->(restaurant)
MATCH (u)-[const:HAS_CONSTRAINT {priority: "hard"}]->(diet)
WHERE restaurant CONFLICTS_WITH diet
RETURN {
  resolution: "CONSTRAINT_WINS",
  reasoning: "Hard constraint overrides preference"
}

This enables the system to explain why a constraint wins, not just that it does.

Phase 3: LangGraph Agent (4-5 hours)

I built a stateful agent using LangGraph that demonstrates conflict-aware decision-making:

Planner node: Breaks tasks into steps
Executor node: Runs each step with Neo4j tool access
Verifier node: Checks progress
Responder node: Generates final response

The agent loads user context from Neo4j at the start of every task, enabling it to make decisions that respect both preferences and constraints.

Phase 4: MemVerge Integration (3-4 hours)

I integrated MemVerge MMC to checkpoint agent state:

Checkpoint node: Creates snapshots every 30 seconds
Restore endpoint: Brings agent back from any checkpoint
Demo script: Shows crash → restore workflow

The demo is powerful: start a long task, "crash" the agent mid-way, then restore it to the exact step it was on.

Phase 5: Frontend Dashboard (4-5 hours)

I built a React dashboard with:

Graph Viewer: Force-directed visualization of the Neo4j graph
Agent Brain: Real-time view of agent execution steps
Checkpoint Timeline: Visual timeline of checkpoints with restore buttons
Demo Controls: Quick actions for hackathon demos

The UI uses Tailwind CSS with a dark, cyberpunk aesthetic that matches the "Phoenix" theme.

The Chrome Extension

The extension is the user-facing magic. It:

Detects ChatGPT/Claude/Gemini using hostname matching
Finds textboxes using site-specific DOM selectors
Provides an "Enhance" button (🔥) that:
- Calls the Context Router with the draft prompt
- Receives a Context Pack (2-5 relevant cards)
- Prepends formatted context to the prompt
- Shows a UI chip with used cards

The extension also has an "Extract" button (📤) that sends conversations to the router for semantic tuple extraction, which then syncs to Neo4j.

Challenges I Faced

Challenge 1: Cross-Site DOM Compatibility

Problem: ChatGPT, Claude, and Gemini all use different DOM structures. ChatGPT uses <textarea>, Claude uses <div contenteditable>, and Gemini uses a mix.

Solution: I built site-specific selector arrays and a findElement() helper that tries each selector until one works. For contenteditable divs, I used document.execCommand('insertText') to properly sync React state.

const SELECTORS = {
  chatgpt: { textbox: ['#prompt-textarea', 'textarea[data-id="root"]'] },
  claude: { textbox: ['div[contenteditable="true"]', 'div.ProseMirror'] },
  gemini: { textbox: ['div[contenteditable="true"]', 'rich-textarea textarea'] }
};

Challenge 2: Graph Conflict Resolution Logic

Problem: How do you programmatically determine if a preference conflicts with a constraint? "I like Steakhouse X" vs "No red meat" is obvious, but what about "Budget $100" vs "Prefer premium products"?

Solution: I used Cypher queries with domain-specific conflict detection:

CASE 
  WHEN const.category = 'Budget' AND toFloat(pref.value) > toFloat(const.value)
  THEN true
  WHEN const.category = 'Diet' AND pref.category = 'Food'
  THEN true
  ELSE false
END as has_conflict

For edge cases, the system falls back to priority rules: hard constraints always override soft preferences.

Challenge 3: MemVerge Integration Without Production Access

Problem: I didn't have production MemVerge MMC access during development, but needed to demo checkpoint/restore.

Solution: I built a simulation layer that mimics MemVerge API calls. The service has two modes:

API mode: Calls MemVerge MMC API (production)
CLI mode: Calls mmcloud CLI commands (development)

The demo script works with both, making it easy to switch when production access is available.

Challenge 4: Profile Extraction from Conversations

Problem: Extracting structured semantic tuples from free-form conversations is hard. Users don't say "I have a constraint: No red meat." They say "I started a vegan diet 3 days ago."

Solution: I built a "distiller" service that uses a local LLM (Ollama) to extract semantic tuples. The prompt is carefully crafted to output JSON:

EXTRACTION_PROMPT = """
Extract semantic tuples from: "{context}"
Output: [{"subject": "User", "predicate": "HAS_CONSTRAINT", 
          "object": "Vegan Diet", "object_type": "Diet", ...}]
"""

This runs locally for privacy, then syncs structured tuples to Neo4j.

Challenge 5: Real-Time Agent State Updates

Problem: The frontend needs to show agent progress in real-time, but the agent runs in a background task.

Solution: I implemented polling with React Query. The frontend polls /api/agent/status/{session_id} every second while the agent is running. For production, this could be upgraded to WebSockets, but polling works fine for the demo.

Challenge 6: Context Pack Formatting

Problem: How do you format 2-5 memory cards into a context string that's useful but not overwhelming?

Solution: I built a formatter that groups by type and priority:

--- PERSONAL CONTEXT ---

HARD CONSTRAINTS:
- No red meat (diet)

SOFT PREFERENCES:
- Budget: $50/month (shopping)
- Communication style: Direct and concise

--- END CONTEXT ---

The "quiet" mode keeps it minimal; "verbose" mode includes explanations.

The Math Behind It

Context Retrieval Efficiency

If a user has $N$ memory cards and we want the top $k$ most relevant, a naive approach would score all $N$ cards. With semantic tags and domain filtering, we can reduce this:

$$O(N) \rightarrow O(\log N + k)$$

By indexing cards by domain and using graph traversal, we only score cards in relevant domains.

Checkpoint Overhead

If checkpoints take $C_p$ seconds and we checkpoint every $C_i$ seconds, the overhead is:

$$\text{Overhead} = \frac{C_p}{C_i} \times 100\%$$

For $C_p = 2$ seconds and $C_i = 30$ seconds, overhead is ~6.7%, which is acceptable for long-running tasks.

Conflict Resolution Confidence

When resolving conflicts, we compute confidence scores:

$$\text{Confidence} = \frac{\text{Constraint Priority} \times \text{Recency Factor}}{\text{Preference Strength}}$$

If confidence > 1.0, constraint wins; otherwise, preference wins with a warning.

What's Next

Phoenix Protocol is a proof-of-concept, but the architecture is production-ready. Future improvements:

WebSocket support for real-time agent updates
Self-hosted Neo4j option for complete privacy
Multi-user support with proper authentication
Mobile extension for iOS Safari
Agent marketplace where users can share profile templates

Conclusion

Phoenix Protocol taught me that in the agentic economy, user memory is the moat. While everyone races to build better agent reasoning, the agents that remember users across conversations, and across different AI providers, will win.

The project combines:

Graph databases for relationship-aware conflict resolution
Local-first architecture for privacy
Non-invasive UX for user control
State persistence for agent immortality
Semantic understanding for intelligent context matching

It's not just about remembering facts, it's about remembering who the user is and what they care about. That's the future of AI interaction.

Built with 🔥 for the agentic economy.

Built With

cypher
javascript
langchain
memverge
neo4j
openai
python
typescript

Updates

Sav Raghav started this project — Dec 19, 2025 12:58 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.