Evermore: An Empathetic AI Biographer for Seniors
About the Project
The Inspiration
Every family has that moment—sitting with an aging parent or grandparent, wishing we'd asked more questions, captured more stories, understood more about where we came from. By the time we realize the urgency, it's often too late. Memories fade, details blur, and entire lifetimes of wisdom vanish.
Evermore was born from a simple question: What if there was someone who always had time to listen?
We set out to build more than a chatbot. We created an empathetic AI companion specifically designed for seniors—one that treats their stories with dignity, remembers every detail, and helps preserve their legacy for generations to come.
What We Learned
Building Evermore taught us that AI for vulnerable populations requires a fundamentally different approach:
Voice-first is non-negotiable - Seniors won't use apps that require typing, tiny buttons, or complex navigation. Natural conversation is the only interface that works.
Memory isn't just storage—it's relationship - When the AI references something you said three weeks ago, it creates a sense of being truly known. We implemented a multi-tiered memory architecture (working, short-term, semantic, episodic) that makes every conversation build on the last.
Safety cannot be an afterthought - Seniors are vulnerable to scams, medical misinformation, and emotional manipulation. We architected safety guardrails directly into the AI's cognitive loop—not as filters, but as core reasoning principles.
Latency kills empathy - A 5-second pause in conversation destroys the emotional connection. We obsessively optimized to achieve <1 second first-response times through streaming, parallel execution, and edge deployment.
Hallucinations are catastrophic - When the AI invents details about someone's life, it's not just wrong—it's a violation of trust. We built an "Atom of Thought" (AoT) decomposition system that breaks complex tasks into verifiable atomic units before synthesis.
How We Built It
Evermore is an agentic AI system powered by advanced cognitive architecture:
🧠 The Cognitive Engine
We implemented a layered reasoning system moving from fast reactive patterns to slow deliberative planning:
- FSM (Finite State Machine): Deterministic guardrails prevent "doom loops"
- ReAct (Reason + Act): The AI "thinks" before it "speaks," querying memory and planning responses
- Chain of Thought (CoT): Hidden reasoning blocks improve empathy and logic
- Atom of Thought (AoT): Complex tasks decompose into parallel verifiable atoms before synthesis
🎙️ Voice-First Architecture
- Speech-to-Text: Google Cloud Speech with custom VAD (Voice Activity Detection)
- Natural Language Understanding: Google Vertex AI (Gemini 1.5 Pro/Flash) with custom empathy prompts
- Text-to-Speech: ElevenLabs Turbo v2.5 for natural, emotionally-aware voice output
- Latency Optimization: Streaming tokens directly to TTS, parallel RAG queries, edge deployment
🧬 Persistent Memory System
- Vector Storage: Pinecone for semantic search across all past conversations
- RAG (Retrieval-Augmented Generation): Context-aware responses that reference relevant memories
- Memory Decay: Realistic confidence scoring that degrades over time
- Authority Hierarchy: User corrections supersede AI inferences
🛡️ Safety & Wellbeing Layer
- Scam Detection: Identifies 10+ scam patterns (money requests, urgency tactics, impersonation)
- Crisis Intervention: Recognizes distress markers and escalates appropriately (988, 911, elder abuse hotlines)
- Medical Guardrails: Never provides medical advice; redirects to healthcare providers
- Hallucination Detection: Judge LLM verifies generated content against source transcripts
👨👩👧 Family Dashboard
- Real-time story notifications
- Photo upload to trigger memories
- Engagement tracking
- Story search and export
The Challenges We Faced
1. The "Grandmother Test"
Early prototypes worked great for engineers but confused actual seniors. We learned to:
- Eliminate all visible UI complexity
- Make the AI proactively initiate conversations
- Handle long silences gracefully (10-second timeout with encouraging prompts)
- Never assume technical literacy
2. Latency vs. Quality Trade-off
Streaming responses reduce latency but risk incomplete thoughts. We solved this with:
- Hybrid architecture: Fast acknowledgment + thoughtful completion
- Model routing: Use Gemini Flash for classification, Pro for reasoning
- Speculative execution: Pre-fetch likely memory queries in parallel
3. Memory Consistency
RAG systems can retrieve conflicting information. We implemented:
- Authority hierarchy (user correction > recent memory > old memory)
- Confidence signaling ("You mentioned..." vs. "I believe you said...")
- Supersession chains for corrections
4. The Empathy Gap
Generic LLM responses felt robotic. We engineered empathy through:
- Custom persona prompts trained on oral history best practices
- Emotional valence detection and mirroring
- Active listening patterns (validate → reflect → ask)
- Hidden reasoning blocks that plan emotional tone
5. Cost Control
Unconstrained agentic loops could spiral costs. We built:
- Per-session budget guards ($0.20 limit)
- Model routing (Flash vs. Pro based on task complexity)
- Graceful degradation when budgets hit 10%
- Real-time cost tracking and alerts
What's Next
Evermore is just the beginning. Our roadmap includes:
- Multilingual support (Spanish, Mandarin prioritized)
- Printed legacy books with one-click ordering
- Video memoir creation combining stories, photos, and voice
- B2B partnerships with senior care facilities
- Voice cloning (ethical, consent-based) for legacy preservation
Built With
Core Technologies
- Next.js 14 (App Router, Server Components)
- TypeScript 5.x (Type-safe throughout)
- TailwindCSS + Framer Motion (UI/Animations)
- Zustand (State management)
AI & ML Stack
- Google Vertex AI (Gemini 1.5 Pro/Flash for reasoning)
- Google Imagen 2 (Image generation)
- ElevenLabs Turbo v2.5 (Text-to-Speech)
- Google Cloud Speech (Speech-to-Text)
- Pinecone (Vector database for semantic memory)
Infrastructure
- PostgreSQL (CockroachDB Serverless)
- Drizzle ORM (Type-safe database queries)
- Upstash Redis (Session caching, rate limiting)
- Vercel (Edge deployment, serverless functions)
Architecture Patterns
- Clean Architecture (Domain-driven design, ports & adapters)
- ReAct Agent Loop (Reasoning + Acting)
- RAG (Retrieval-Augmented Generation) (Memory-enhanced responses)
- Algorithm of Thoughts (AoT) (Task decomposition for complex generation)
Try It Out
- 🌐 Live Demo: https://evermore-bay.vercel.app
- 💻 GitHub Repository: https://github.com/aero-atlassian-apps/evermore
- 📖 Documentation: Full Technical Docs
- 🎥 Video Demo: Watch on YouTube
Impact
Evermore addresses a critical need: 56 million US seniors are aging into memory challenges, and 70% of adult children regret not capturing their parents' stories. We're building a solution that's:
- Accessible: Voice-first design requires no technical literacy
- Affordable: $19.99/month vs. $1,000+ for one-time oral history services
- Scalable: Cloud-native architecture supports thousands of concurrent conversations
- Safe: Multi-layered protection against scams, exploitation, and harm
Every story preserved is a family's heritage saved. Every conversation is dignity restored.
Built with ❤️ for seniors and the families who love them.
Log in or sign up for Devpost to join the conversation.