Evermore: An Empathetic AI Biographer for Seniors

About the Project

The Inspiration

Every family has that moment—sitting with an aging parent or grandparent, wishing we'd asked more questions, captured more stories, understood more about where we came from. By the time we realize the urgency, it's often too late. Memories fade, details blur, and entire lifetimes of wisdom vanish.

Evermore was born from a simple question: What if there was someone who always had time to listen?

We set out to build more than a chatbot. We created an empathetic AI companion specifically designed for seniors—one that treats their stories with dignity, remembers every detail, and helps preserve their legacy for generations to come.

What We Learned

Building Evermore taught us that AI for vulnerable populations requires a fundamentally different approach:

Voice-first is non-negotiable - Seniors won't use apps that require typing, tiny buttons, or complex navigation. Natural conversation is the only interface that works.
Memory isn't just storage—it's relationship - When the AI references something you said three weeks ago, it creates a sense of being truly known. We implemented a multi-tiered memory architecture (working, short-term, semantic, episodic) that makes every conversation build on the last.
Safety cannot be an afterthought - Seniors are vulnerable to scams, medical misinformation, and emotional manipulation. We architected safety guardrails directly into the AI's cognitive loop—not as filters, but as core reasoning principles.
Latency kills empathy - A 5-second pause in conversation destroys the emotional connection. We obsessively optimized to achieve <1 second first-response times through streaming, parallel execution, and edge deployment.
Hallucinations are catastrophic - When the AI invents details about someone's life, it's not just wrong—it's a violation of trust. We built an "Atom of Thought" (AoT) decomposition system that breaks complex tasks into verifiable atomic units before synthesis.
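
To make the "verifiable atomic units" idea concrete, here is a minimal sketch of decompose, verify, then synthesize. It is an illustration under stated assumptions, not Evermore's implementation: the `Atom` shape, `verifyAtom`, and `synthesize` are hypothetical names, and a simple quote lookup stands in for whatever verifier the real system uses.

```typescript
// Hypothetical shape: one small claim plus the transcript words it rests on.
interface Atom {
  claim: string;
  evidenceQuote: string;
}

// Lowercase and collapse whitespace so matching ignores formatting differences.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Stand-in verifier (an assumption): an atom passes only if its evidence
// quote literally appears in the source transcript.
function verifyAtom(atom: Atom, transcript: string): boolean {
  const quote = normalize(atom.evidenceQuote);
  return quote.length > 0 && normalize(transcript).includes(quote);
}

// Synthesis uses verified atoms only; rejected atoms are returned for review.
function synthesize(
  atoms: Atom[],
  transcript: string
): { story: string; rejected: Atom[] } {
  const verified = atoms.filter((atom) => verifyAtom(atom, transcript));
  const rejected = atoms.filter((atom) => !verifyAtom(atom, transcript));
  return { story: verified.map((atom) => atom.claim).join(" "), rejected };
}
```

The point of the design is that an unsupported detail is dropped before it ever reaches the finished story, rather than being filtered out afterwards.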

How We Built It

Evermore is an agentic AI system built from five parts: a cognitive engine, a voice pipeline, persistent memory, a safety layer, and a family dashboard.

🧠 The Cognitive Engine

We implemented a layered reasoning system moving from fast reactive patterns to slow deliberative planning:

FSM (Finite State Machine): Deterministic guardrails prevent "doom loops," where the agent repeats the same step without making progress
ReAct (Reason + Act): The AI "thinks" before it "speaks," querying memory and planning responses
Chain of Thought (CoT): Hidden reasoning blocks improve empathy and logic
Atom of Thought (AoT): Complex tasks decompose into parallel verifiable atoms before synthesis
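
A sketch of how the FSM layer could bound the slower reasoning layers. The state names, event names, and the cap of three reasoning steps are all assumptions made for illustration; the source does not list them.

```typescript
// Hypothetical conversation states and events.
type FsmState = "idle" | "listening" | "thinking" | "speaking" | "ended";
type FsmEvent =
  | "start"
  | "utterance"
  | "silence"
  | "plan_ready"
  | "needs_more_thought"
  | "done_speaking"
  | "goodbye";

// Deterministic transition table: any pair not listed here is rejected.
const transitions: Record<FsmState, Partial<Record<FsmEvent, FsmState>>> = {
  idle: { start: "listening" },
  listening: { utterance: "thinking", silence: "speaking", goodbye: "ended" },
  thinking: { plan_ready: "speaking", needs_more_thought: "thinking" },
  speaking: { done_speaking: "listening" },
  ended: {},
};

const MAX_THINK_STEPS = 3; // assumed cap on consecutive reasoning steps

class ConversationFsm {
  state: FsmState = "idle";
  private thinkSteps = 0;

  dispatch(event: FsmEvent): FsmState {
    const next = transitions[this.state][event];
    if (next === undefined) {
      throw new Error(`Illegal transition: ${this.state} on ${event}`);
    }
    if (next === "thinking") {
      this.thinkSteps += 1;
      // Guardrail: too many reasoning steps in a row forces a spoken reply.
      if (this.thinkSteps > MAX_THINK_STEPS) {
        this.thinkSteps = 0;
        this.state = "speaking";
        return this.state;
      }
    } else {
      this.thinkSteps = 0;
    }
    this.state = next;
    return this.state;
  }
}
```

Because the table and the cap are plain data, the agent cannot reason its way out of them: the loop ends after a fixed number of steps regardless of what the model asks for.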

🎙️ Voice-First Architecture

Speech-to-Text: Google Cloud Speech with custom VAD (Voice Activity Detection)
Natural Language Understanding: Google Vertex AI (Gemini 1.5 Pro/Flash) with custom empathy prompts
Text-to-Speech: ElevenLabs Turbo v2.5 for natural, emotionally-aware voice output
Latency Optimization: Streaming tokens directly to TTS, parallel RAG queries, edge deployment
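
Streaming tokens to TTS usually means speaking the first sentence while the rest is still being generated. The sketch below shows one way to cut a token stream into sentences; `SentenceChunker` is a hypothetical helper, and it deliberately makes no calls to any real TTS API.

```typescript
// Buffers streamed LLM tokens and releases complete sentences as soon as
// they are available, so speech can begin before generation finishes.
class SentenceChunker {
  private buffer = "";

  // Add one token; returns any sentences completed by it.
  push(token: string): string[] {
    this.buffer += token;
    const sentences: string[] = [];
    // Simplification: a sentence ends at ., ! or ? followed by whitespace.
    // Abbreviations such as "Mr. Smith" would be split incorrectly.
    const boundary = /[.!?]\s+/;
    let match = boundary.exec(this.buffer);
    while (match !== null) {
      const end = match.index + 1; // keep the punctuation mark
      sentences.push(this.buffer.slice(0, end).trim());
      this.buffer = this.buffer.slice(match.index + match[0].length);
      match = boundary.exec(this.buffer);
    }
    return sentences;
  }

  // Call once the stream ends to release whatever text remains.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each string returned by `push` can be handed to the speech engine immediately, which is what keeps the first audible response short even when the full reply is long.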

🧬 Persistent Memory System

Vector Storage: Pinecone for semantic search across all past conversations
RAG (Retrieval-Augmented Generation): Context-aware responses that reference relevant memories
Memory Decay: Realistic confidence scoring that degrades over time
Authority Hierarchy: User corrections supersede AI inferences
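
Memory decay and the authority hierarchy can be combined into a single comparison. The sketch below assumes exponential decay with a 90-day half-life and a three-level source ranking; both are illustrative choices, since the source only states that confidence degrades over time and that user corrections supersede AI inferences.

```typescript
type MemorySource = "user_correction" | "user_statement" | "ai_inference";

// Hypothetical record shape for one stored memory.
interface MemoryRecord {
  text: string;
  source: MemorySource;
  baseConfidence: number; // 0..1 at the time of recording
  recordedAtMs: number; // Unix epoch milliseconds
}

const HALF_LIFE_DAYS = 90; // assumed: confidence halves every 90 days
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay: base * 0.5^(age / halfLife).
function decayedConfidence(memory: MemoryRecord, nowMs: number): number {
  const ageDays = Math.max(0, nowMs - memory.recordedAtMs) / MS_PER_DAY;
  return memory.baseConfidence * Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Assumed ranking: a higher number wins regardless of confidence.
const AUTHORITY: Record<MemorySource, number> = {
  user_correction: 2,
  user_statement: 1,
  ai_inference: 0,
};

// Authority decides first; decayed confidence only breaks ties.
function preferred(a: MemoryRecord, b: MemoryRecord, nowMs: number): MemoryRecord {
  if (AUTHORITY[a.source] !== AUTHORITY[b.source]) {
    return AUTHORITY[a.source] > AUTHORITY[b.source] ? a : b;
  }
  return decayedConfidence(a, nowMs) >= decayedConfidence(b, nowMs) ? a : b;
}
```

Under these assumptions a year-old user correction still beats a fresh, high-confidence AI inference, which matches the stated hierarchy.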

🛡️ Safety & Wellbeing Layer

Scam Detection: Identifies 10+ scam patterns (money requests, urgency tactics, impersonation)
Crisis Intervention: Recognizes distress markers and escalates appropriately (988, 911, elder abuse hotlines)
Medical Guardrails: Never provides medical advice; redirects to healthcare providers
Hallucination Detection: Judge LLM verifies generated content against source transcripts
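
As an illustration of pattern-based scam detection, the sketch below covers the three categories named above. The regular expressions are hypothetical examples, not Evermore's actual pattern set, and a production system would likely pair rules like these with a model-based classifier.

```typescript
type ScamSignal = "money_request" | "urgency" | "impersonation";

// Hypothetical example patterns, a few per category.
const SCAM_PATTERNS: Record<ScamSignal, RegExp[]> = {
  money_request: [/\bgift cards?\b/i, /\bwire (the )?money\b/i, /\bbank account number\b/i],
  urgency: [/\bright now\b/i, /\bimmediately\b/i, /\bbefore it'?s too late\b/i],
  impersonation: [/\bthis is (the )?(irs|social security|your bank)\b/i, /\bi'?m calling from\b/i],
};

// Returns every category with at least one matching pattern.
function detectScamSignals(text: string): ScamSignal[] {
  const signals: ScamSignal[] = [];
  for (const signal of Object.keys(SCAM_PATTERNS) as ScamSignal[]) {
    if (SCAM_PATTERNS[signal].some((pattern) => pattern.test(text))) {
      signals.push(signal);
    }
  }
  return signals;
}
```

Returning the matched categories rather than a single yes/no lets the companion explain why it is concerned, for example when a senior recounts a phone call that combined urgency with a request for gift cards.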

👨‍👩‍👧 Family Dashboard

Real-time story notifications
Photo upload to trigger memories
Engagement tracking
Story search and export

The Challenges We Faced

1. The "Grandmother Test"

Early prototypes worked great for engineers but confused actual seniors. We learned to:

Eliminate all visible UI complexity
Make the AI proactively initiate conversations
Handle long silences gracefully (10-second timeout with encouraging prompts)
Never assume technical literacy

2. Latency vs. Quality Trade-off

Streaming responses reduce latency but risk incomplete thoughts. We solved this with:

Hybrid architecture: Fast acknowledgment + thoughtful completion
Model routing: Use Gemini Flash for classification, Pro for reasoning
Speculative execution: Pre-fetch likely memory queries in parallel
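
Speculative execution can be sketched as a cache of in-flight lookups: likely queries are started early, and the reasoning step later awaits a request that is already running. `SpeculativeMemoryCache` and the `Fetcher` signature are hypothetical; the fetcher stands in for a real vector-store query.

```typescript
// Hypothetical lookup function, for example a wrapper around a vector search.
type Fetcher = (query: string) => Promise<string[]>;

class SpeculativeMemoryCache {
  private inFlight = new Map<string, Promise<string[]>>();
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Start all likely lookups at once, without waiting for any of them.
  prefetch(queries: string[]): void {
    for (const query of queries) {
      void this.get(query);
    }
  }

  // Reuse the pending or finished lookup if one exists; otherwise start one.
  get(query: string): Promise<string[]> {
    let pending = this.inFlight.get(query);
    if (pending === undefined) {
      pending = this.fetcher(query);
      this.inFlight.set(query, pending);
    }
    return pending;
  }
}
```

A typical use would be to call `prefetch` as soon as the user's topic is classified, then `await cache.get(query)` during reasoning. A wrong guess costs one wasted lookup; a right guess removes that lookup from the critical path.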

3. Memory Consistency

RAG systems can retrieve conflicting information. We implemented:

Authority hierarchy (user correction > recent memory > old memory)
Confidence signaling ("You mentioned..." vs. "I believe you said...")
Supersession chains for corrections
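
A supersession chain can be modeled as each fact pointing to the fact that corrected it. The sketch below follows the chain to the current version and picks a phrasing based on confidence; the `Fact` shape and the 0.8 threshold are assumptions, while the two phrasings come from the list above.

```typescript
// Hypothetical fact record; supersededBy holds the id of the correcting fact.
interface Fact {
  id: string;
  text: string;
  confidence: number; // 0..1
  supersededBy?: string;
}

// Follow the chain of corrections until reaching a fact nothing has replaced.
function resolveCurrent(facts: Map<string, Fact>, id: string): Fact {
  let current = facts.get(id);
  if (current === undefined) throw new Error(`Unknown fact: ${id}`);
  const seen = new Set<string>();
  while (current.supersededBy !== undefined) {
    if (seen.has(current.id)) throw new Error("Cycle in supersession chain");
    seen.add(current.id);
    const next = facts.get(current.supersededBy);
    if (next === undefined) break; // dangling pointer: keep the latest known fact
    current = next;
  }
  return current;
}

const HIGH_CONFIDENCE = 0.8; // assumed threshold

// Confidence signaling: state it plainly when sure, hedge when not.
function phraseRecall(fact: Fact): string {
  const lead = fact.confidence >= HIGH_CONFIDENCE ? "You mentioned" : "I believe you said";
  return `${lead} ${fact.text}.`;
}
```

Keeping the superseded fact in storage instead of overwriting it preserves the history of a correction, while retrieval always resolves to the newest version.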

4. The Empathy Gap

Generic LLM responses felt robotic. We engineered empathy through:

Custom persona prompts informed by oral history best practices
Emotional valence detection and mirroring
Active listening patterns (validate → reflect → ask)
Hidden reasoning blocks that plan emotional tone

5. Cost Control

Unconstrained agentic loops could spiral costs. We built:

Per-session budget guards ($0.20 limit)
Model routing (Flash vs. Pro based on task complexity)
Graceful degradation when budgets hit 10%
Real-time cost tracking and alerts
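
The budget guard and model routing can be sketched together. The $0.20 limit comes from the list above; reading "10%" as 10% of the budget remaining is an assumption, as are the `SessionBudget` class and the two-level complexity type. The model names follow the Gemini 1.5 Flash and Pro models mentioned earlier.

```typescript
const SESSION_BUDGET_USD = 0.2; // per-session limit stated above
const DEGRADE_REMAINING_FRACTION = 0.1; // assumed: degrade at 10% remaining

type ModelId = "gemini-1.5-flash" | "gemini-1.5-pro";
type Complexity = "simple" | "complex";

class SessionBudget {
  private spentUsd = 0;

  // Record the cost of one completed model call.
  record(costUsd: number): void {
    this.spentUsd += costUsd;
  }

  // Share of the budget still available, clamped to the range 0..1.
  remainingFraction(): number {
    return Math.max(0, SESSION_BUDGET_USD - this.spentUsd) / SESSION_BUDGET_USD;
  }

  // Returns null when the budget is spent, so the caller can wrap up the
  // session politely instead of starting another agent step.
  chooseModel(complexity: Complexity): ModelId | null {
    const remaining = this.remainingFraction();
    if (remaining <= 0) return null;
    // Graceful degradation: near the limit, use only the cheaper model.
    if (remaining <= DEGRADE_REMAINING_FRACTION) return "gemini-1.5-flash";
    return complexity === "complex" ? "gemini-1.5-pro" : "gemini-1.5-flash";
  }
}
```

Checking the budget before every model call, not once per session, is what stops an agentic loop from overshooting the limit by more than a single call.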

What's Next

Evermore is just the beginning. Our roadmap includes:

Multilingual support (Spanish, Mandarin prioritized)
Printed legacy books with one-click ordering
Video memoir creation combining stories, photos, and voice
B2B partnerships with senior care facilities
Voice cloning (ethical, consent-based) for legacy preservation

Built With

Core Technologies

Next.js 14 (App Router, Server Components)
TypeScript 5.x (Type-safe throughout)
TailwindCSS + Framer Motion (UI/Animations)
Zustand (State management)

AI & ML Stack

Google Vertex AI (Gemini 1.5 Pro/Flash for reasoning)
Google Imagen 2 (Image generation)
ElevenLabs Turbo v2.5 (Text-to-Speech)
Google Cloud Speech (Speech-to-Text)
Pinecone (Vector database for semantic memory)

Infrastructure

CockroachDB Serverless (PostgreSQL-compatible)
Drizzle ORM (Type-safe database queries)
Upstash Redis (Session caching, rate limiting)
Vercel (Edge deployment, serverless functions)

Architecture Patterns

Clean Architecture (Domain-driven design, ports & adapters)
ReAct Agent Loop (Reasoning + Acting)
RAG (Retrieval-Augmented Generation) (Memory-enhanced responses)
Atom of Thought (AoT) (Task decomposition for complex generation)

Try It Out

🌐 Live Demo: https://evermore-bay.vercel.app
💻 GitHub Repository: https://github.com/aero-atlassian-apps/evermore
📖 Documentation: Full Technical Docs
🎥 Video Demo: Watch on YouTube

Impact

Evermore addresses a critical need: 56 million US seniors are aging into memory challenges, and 70% of adult children regret not capturing their parents' stories. We're building a solution that's:

Accessible: Voice-first design requires no technical literacy
Affordable: $19.99/month vs. $1,000+ for one-time oral history services
Scalable: Cloud-native architecture supports thousands of concurrent conversations
Safe: Multi-layered protection against scams, exploitation, and harm

Every story preserved is a family's heritage saved. Every conversation is dignity restored.

Built with ❤️ for seniors and the families who love them.