Inspiration
My grandma has been battling Alzheimer’s for almost 20 years (we just celebrated her 90th birthday!). It started with small things—repeating stories, forgetting what she just did. She would confuse day with night, winter with summer. Gradually, she forgot her own name, then all of ours.
I grew up with my grandma. When others lost patience, I tried to listen—to the same stories over and over. But I wasn’t perfect. Sometimes, I had work to do; sometimes, I wanted to spend time with friends. And every time I saw her sitting alone, guilt washed over me.
Conversation is powerful for Alzheimer’s patients. It keeps their minds active and helps them feel connected and valued. My grandma also responded best to visual cues, and research shows that even an hour of daily visual engagement can slow cognitive decline.
That’s why I built Grandma’s Girl, an AI companion with infinite patience. One that listens, engages, asks questions, and brings conversations to life with images. A companion who never gets tired, who keeps her mind engaged, and who, maybe, just maybe, helps her hold on to her memories a little longer, one conversation at a time.
What It Does
Grandma's Girl is an AI companion that helps Alzheimer’s patients retain memories through conversation and visual aids. Inspired by caring for my grandma, it engages through speech and real-time image generation.
Key Features:
- Real-time voice conversations with an AI agent that speaks in a cloned voice of a familiar person (in this case, me!).
- Intelligent image generation using text-to-image models and NLP techniques to create visuals based on conversation context.
- A dynamic image gallery that visually maps stories, sparking memory recall.
- A visual history tracker that helps caregivers monitor which images resonate most.
- Conversation analysis to identify patterns in the patient’s recall, aiding caregivers in tracking cognitive decline over time.
How We Built It
This system is designed with a deep understanding of Alzheimer's care, ensuring warmth, familiarity, and engagement at every step.
Conversational AI & Voice Cloning
- Conversational AI: Uses ElevenLabs’ AI agents for natural speech-to-text and text-to-speech interaction.
- Voice Cloning: Familiar voices are essential. Alzheimer’s patients can feel anxious with unfamiliar ones, so I cloned my own voice as the granddaughter to provide comfort.
Carefully Designed System Prompts
- Basic Information Recall: Regularly prompts the patient about their name, hometown, and surroundings to reinforce memory and track cognitive function.
- Interactive Conversations with Visual Cues: The agent detects relevant images and asks if they match the patient's memories.
- Contextual Follow-Ups: Encourages the patient to recall what they just said, strengthening conversational continuity.
- Storytelling Focus: Prioritizes questions about childhood and past events, as Alzheimer’s patients often recall these more vividly than recent ones. (Yes, it’s counterintuitive.)
- Emotional Intelligence: Maintains a warm, patient, and positive tone—because making patients feel connected is just as important as cognitive recall.
Tool Use
Given the hackathon’s time constraints, I focused on essential tools:
- Date & Time Awareness: The agent provides real-time information to help patients stay oriented. Alzheimer’s patients often lose their sense of time, forgetting to eat or perform daily tasks.
- Knowledge Base: A structured text file with basic personal and contextual information for the agent to retrieve and verify responses.
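As a rough illustration of these two tools, the sketch below parses a small "key: value" knowledge base and produces a spoken-style description of the current date and part of day. The file layout, the field names, the placeholder values, and the helper names `parse_knowledge_base` and `describe_time` are all assumptions made for this example; the project's actual knowledge base format and tool definitions are not shown in this write-up.

```python
from datetime import datetime

# Hypothetical knowledge base contents with placeholder values.
KNOWLEDGE_BASE_TEXT = """\
name: Jane Doe
hometown: Exampletown
"""


def parse_knowledge_base(text: str) -> dict[str, str]:
    """Parse 'key: value' lines into a dict, skipping lines without a colon."""
    facts = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        facts[key.strip().lower()] = value.strip()
    return facts


def describe_time(now: datetime) -> str:
    """Return a short sentence with the weekday, date, and part of day."""
    hour = now.hour
    if 5 <= hour < 12:
        part = "morning"
    elif 12 <= hour < 17:
        part = "afternoon"
    elif 17 <= hour < 21:
        part = "evening"
    else:
        part = "night"
    return f"It is {now.strftime('%A, %B')} {now.day}, in the {part}."


facts = parse_knowledge_base(KNOWLEDGE_BASE_TEXT)
print(facts["hometown"])
print(describe_time(datetime.now()))
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the helper easy to test with fixed timestamps.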
Evaluation Criteria for the Agent
- Ensures the AI asks key questions (e.g., name, hometown, time of day).
- Accurately corrects misinformation using the knowledge base.
See the attached agent_criteria_evaluation.png in the Project Media section
Data Collection for Cognitive Tracking
- Extracts patient responses to basic questions.
- Helps caregivers and medical professionals track cognitive decline over time.
See the attached data_collection.png in the Project Media section
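One way to picture the tracking step is a per-session score: compare each extracted answer with the expected fact and store the result so it can be compared across sessions. The sketch below is only an assumption about how such a log could look. The class name `RecallLog`, the substring-matching rule, and the placeholder facts are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass, field


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so comparisons are forgiving."""
    return " ".join(text.lower().split())


@dataclass
class RecallLog:
    """Stores, per session, which basic facts the patient recalled."""

    facts: dict[str, str]
    history: list[dict[str, bool]] = field(default_factory=list)

    def record_session(self, answers: dict[str, str]) -> float:
        """Record one session and return the fraction of facts recalled."""
        results = {
            key: normalize(expected) in normalize(answers.get(key, ""))
            for key, expected in self.facts.items()
        }
        self.history.append(results)
        return sum(results.values()) / len(results) if results else 0.0


log = RecallLog({"name": "Jane", "hometown": "Exampletown"})
score = log.record_session(
    {"name": "My name is Jane", "hometown": "I don't remember"}
)
print(score, log.history[-1])
```

A simple substring match will miss paraphrases and nicknames, so a real deployment would likely need a more tolerant comparison and a caregiver reviewing the results.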
Automatic Image Generation Based on Conversation
- Real-time visual aid generation to bring memories to life.
- Model Choice: Flux Pro via Fal.ai API for personalized visuals.
- NLP-Driven Triggers: Uses spaCy to detect strong visual elements (e.g., "I lived by a river with a lot of fish in it") and trigger image creation.
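The trigger logic can be sketched as a small filter over part-of-speech tags. In the real pipeline the `(text, tag)` pairs would presumably come from spaCy, for example `[(t.text, t.pos_) for t in nlp(sentence)]`. Here they are written by hand so the sketch runs without downloading a model. The noun threshold, the list of vague nouns, and the prompt wording are assumptions for illustration, not the project's actual rules.

```python
# Tags treated as potentially visual (Universal POS tag names).
VISUAL_POS = {"NOUN", "PROPN"}
# Hypothetical stoplist of nouns too vague to draw.
VAGUE_NOUNS = {"lot", "thing", "time", "way"}


def visual_nouns(tagged_tokens):
    """Return unique, lowercased visual nouns in order of appearance."""
    seen = []
    for text, pos in tagged_tokens:
        word = text.lower()
        if pos in VISUAL_POS and word not in VAGUE_NOUNS and word not in seen:
            seen.append(word)
    return seen


def image_prompt(tagged_tokens, min_nouns=2):
    """Build an image prompt, or return None if the sentence is not visual enough."""
    nouns = visual_nouns(tagged_tokens)
    if len(nouns) < min_nouns:
        return None
    return "Illustration featuring: " + ", ".join(nouns)


# Hand-tagged version of "I lived by a river with a lot of fish in it".
EXAMPLE = [
    ("I", "PRON"), ("lived", "VERB"), ("by", "ADP"), ("a", "DET"),
    ("river", "NOUN"), ("with", "ADP"), ("a", "DET"), ("lot", "NOUN"),
    ("of", "ADP"), ("fish", "NOUN"), ("in", "ADP"), ("it", "PRON"),
]
print(image_prompt(EXAMPLE))
```

Requiring at least two concrete nouns is one simple way to avoid generating an image for every short reply such as "yes" or "I think so".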
Minimalist, Intuitive UI (React App)
- Designed for Simplicity: No distractions—just the conversation and its associated images.
- Gallery View: Past images are accessible on the side for recall.
- Consistent Visual Style: To maintain a uniform style across generated images, additional prompts were provided to Flux Pro, and the fine-tune strength was adjusted.
- Caregiver Insights: Helps caregivers track which images spark the strongest responses.
Tech Stack & Architecture
- Backend: Flask-SocketIO for real-time WebSocket communication.
- NLP Processing: spaCy for noun extraction and image trigger detection.
- Event-Driven Architecture: Ensures smooth interaction between conversation and image generation.
- Multi-threading: Prevents blocking operations during image generation.
- Frontend: React for a modern, responsive experience.
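The multi-threading point can be illustrated with a standard worker-queue pattern: the conversation handler only enqueues a prompt, and a background thread does the slow image generation and reports back through a callback. This is a minimal sketch using the Python standard library only. The function name `start_image_worker` and both callbacks are hypothetical. In the real app, `generate` would presumably call the image API and `on_ready` would push the result to the browser over the WebSocket.

```python
import queue
import threading


def start_image_worker(generate, on_ready):
    """Start a daemon thread that turns queued prompts into images.

    Put a prompt string on the returned queue to request an image,
    or None to stop the worker.
    """
    jobs = queue.Queue()

    def worker():
        while True:
            prompt = jobs.get()
            try:
                if prompt is None:  # sentinel: shut down
                    return
                try:
                    image = generate(prompt)
                except Exception:
                    image = None  # report failure instead of killing the worker
                on_ready(prompt, image)
            finally:
                jobs.task_done()

    threading.Thread(target=worker, daemon=True).start()
    return jobs


# Stand-in callbacks so the sketch runs on its own.
finished = []
jobs = start_image_worker(
    generate=lambda prompt: f"image for {prompt}",
    on_ready=lambda prompt, image: finished.append((prompt, image)),
)
jobs.put("a river full of fish")  # returns immediately
jobs.put(None)
jobs.join()  # wait here only for the demo
print(finished)
```

Because `put` returns right away, the conversation loop is never blocked while an image is being generated.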
Challenges We Ran Into
- Too many ideas, too little time. So many features to build, so many tools to explore!
- Real-time updates & WebSockets. Ensuring smooth communication between the frontend and backend took time.
- The Infinite Conversation Loop. If the AI hears itself, it keeps talking… forever. (Fun, but expensive!) I spent a while debugging, thinking it was an async communication issue.
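The write-up does not say how the self-hearing loop was eventually fixed. One common approach, shown here purely as an assumption, is to gate the microphone: drop incoming audio while the agent is speaking and for a short tail afterwards. The class name `MicGate` and the 0.3 second tail are hypothetical. Headphones or acoustic echo cancellation are alternative remedies.

```python
class MicGate:
    """Decides whether microphone audio should be forwarded to the agent."""

    def __init__(self, tail_seconds: float = 0.3):
        self.tail_seconds = tail_seconds
        self._agent_speaking = False
        self._quiet_since = float("-inf")  # agent has never spoken yet

    def agent_started(self) -> None:
        """Call when agent audio playback begins."""
        self._agent_speaking = True

    def agent_stopped(self, now: float) -> None:
        """Call when agent audio playback ends; `now` is a timestamp in seconds."""
        self._agent_speaking = False
        self._quiet_since = now

    def should_forward(self, now: float) -> bool:
        """True if a microphone chunk captured at `now` should be sent on."""
        if self._agent_speaking:
            return False
        return now - self._quiet_since >= self.tail_seconds


gate = MicGate()
gate.agent_started()
print(gate.should_forward(1.0))  # agent is talking, so the chunk is dropped
gate.agent_stopped(2.0)
print(gate.should_forward(2.5))  # tail has passed, so the chunk is forwarded
```

The trade-off is that the patient cannot interrupt the agent mid-sentence, which may or may not be acceptable for this use case.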
Accomplishments We’re Proud Of
- Built something I truly care about. My grandma is at an advanced stage, so she won’t benefit from this, but I hope it helps many others.
What We Learned
- ElevenLabs’ AI voice cloning is incredible. The accuracy is significantly better than speech-to-speech models, with almost no hallucinations.
- Fal.ai is super easy to use. Ready-to-use APIs made integration smooth.
- WebSockets can be tricky. Debugging real-time communication taught me a lot.
- Fine-tuned my voice clone. Turns out, my voice is way more monotonous than I thought—I had to tweak it multiple times to make it sound warm and engaging.
What’s Next for Grandma’s Girl
This project was built with Alzheimer's patients in mind, but its potential goes beyond that.
We all have images stuck in our minds. Drawing them out is difficult (unless you’re Picasso), but words come easier. Imagine an AI memory artist that turns our thoughts into visuals.
The Next Step? Interactive Image Refinement
Instead of generating a new image every time, the AI will:
- Use past images as references.
- Engage in back-and-forth refinement.
- Ask: "Does this look like what you had in mind?"
- Iteratively build a scene based on user feedback.
A co-creative AI that brings our mental images to life—one conversation at a time.
Final Thoughts
This hackathon gave me the opportunity to build something deeply personal. While I couldn’t pack in every feature I envisioned, this is just the beginning. I hope Grandma’s Girl can be refined, expanded, and used by those who need it most. 💙
Built With
- elevenlabs
- fal.ai
- flask
- javascript
- python
- react
- spacy