memento

Memento 🧠👓

A memory-support system for dementia patients powered by voice, vision, and AI

About the project

Memento is an assistive technology platform designed to support individuals living with dementia by helping them recognize familiar faces, remember conversations, and reduce daily anxiety caused by memory loss.

By combining smart glasses, real-time facial recognition, voice transcription with diarization, and AI-powered caregiver assistance, Memento functions as an external, supportive memory layer, operating passively in the background while preserving the user’s dignity and independence.

Inspiration 💡

Dementia patients often struggle not just with forgetting facts, but with forgetting people: their faces, names, and relationships, as well as recent conversations. This leads to confusion, repeated questions, and emotional distress for both patients and caregivers.

We were inspired by a simple but powerful question:

What if technology could quietly help someone remember who they’re talking to, and what just happened, without requiring effort or technical skill?

Rather than building another reminder app, we focused on human signals: faces and voices. Memento was inspired by the idea that memory support should be passive, contextual, and empathetic, especially for cognitively vulnerable users.

What it does 🛠️✨

Memento is composed of three tightly integrated systems:

🧑‍💻 Dynamic Web Application built using Next.js with a Flask backend, RESTful APIs and MongoDB as a proof of concept

Patients can register an account with full authentication and manage their personal memory space
They can add people they know (family, friends, caregivers)
For each acquaintance, the app stores:
- Name
- Relationship
- Personal summary
- Facial embeddings
All data is securely stored in MongoDB
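
As a rough illustration of the per-person record listed above, the sketch below models one acquaintance and flattens it into a dictionary that could be inserted into a MongoDB collection. The class name `Acquaintance`, the helper `to_document`, and every field name are assumptions made for this example, not the project's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Acquaintance:
    """One person in a patient's memory space (hypothetical field names)."""
    name: str
    relationship: str
    summary: str
    # One embedding per enrolled photo; each embedding is a list of floats.
    embeddings: List[List[float]] = field(default_factory=list)


def to_document(patient_id: str, person: Acquaintance) -> dict:
    """Flatten the record into a plain dict and tag it with its owner."""
    doc = asdict(person)
    doc["patient_id"] = patient_id  # assumed link back to the patient account
    return doc
```

Keeping several embeddings per person is a design choice in this sketch: it leaves room for enrolling photos taken under different lighting and angles.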

👓 Real-Time Facial Recognition (Smart Glasses Simulation)

Smart glasses capture live video
Faces are processed using InsightFace
Facial embeddings are compared using cosine similarity:

\[ \text{similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} \]

When a match is found:
- The person’s name, relationship, and summary are retrieved
- This information can be surfaced to the user or caregiver in real time
This helps dementia patients recognize who they are interacting with, reducing fear and confusion
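
The matching step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's pipeline: the function names, the dictionary of known embeddings, and the `0.5` threshold are all assumptions, and a real system would use the embeddings produced by InsightFace rather than the toy vectors shown in the usage note.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return (a . b) / (|a| |b|), or 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: List[float],
    known: Dict[str, List[float]],
    threshold: float = 0.5,  # illustrative value, not a tuned setting
) -> Optional[Tuple[str, float]]:
    """Return (name, score) of the closest known face, or None if unsure."""
    best_name, best_score = None, -1.0
    for name, embedding in known.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    # Prefer "no match" over a low-confidence guess.
    if best_name is None or best_score < threshold:
        return None
    return best_name, best_score
```

For example, with `known = {"Maria": [1.0, 0.0], "Tom": [0.0, 1.0]}`, the query `[0.9, 0.1]` matches "Maria", while a query that resembles neither stored face returns `None`. Returning `None` below the threshold reflects the point that a wrong name can be more distressing than no name.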

🎙️ Voice & Conversation Memory (Diarization + AI)

A separate web application transcribes conversations in real time
Built using the Gemini API
Uses speaker diarization to determine who spoke when
Conversations are:
- Timestamped
- Speaker-labeled
- Stored for later review
We also built a Gemini-powered caretaker agent 🤖 that can:
- Answer questions about recent interactions
- Help caregivers understand conversation context
- Assist in reassurance and recall
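
One way to picture how timestamped, speaker-labeled transcripts could feed the caretaker agent is shown below. This is only a sketch under stated assumptions: the `Segment` type, the helper names, and the prompt wording are hypothetical, and the call to the Gemini API itself is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One diarized utterance (hypothetical structure)."""
    start: float   # seconds from the start of the recording
    end: float
    speaker: str
    text: str


def build_context(segments: List[Segment], since: float) -> str:
    """Render every segment that ends at or after `since`, oldest first."""
    recent = sorted(
        (s for s in segments if s.end >= since),
        key=lambda s: s.start,
    )
    return "\n".join(
        f"[{s.start:.0f}s-{s.end:.0f}s] {s.speaker}: {s.text}" for s in recent
    )


def build_prompt(question: str, segments: List[Segment], since: float) -> str:
    """Combine the recent transcript and a caregiver question into one prompt."""
    context = build_context(segments, since)
    return (
        "Answer using only the transcript below.\n\n"
        f"Transcript:\n{context}\n\n"
        f"Question: {question}"
    )
```

Restricting the context to a recent time window keeps the prompt short and keeps answers tied to what was actually said.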

How we built it 🧱

Frontend

Next.js for the main patient-facing application
Clean, accessible UI focused on simplicity
Separate web interface for conversation transcription and review

Backend

Flask (Python) servers for:
- Facial recognition pipeline
- Voice processing and diarization
MongoDB for:
- Facial embeddings
- User profiles
- Conversation transcripts
- Speaker metadata

AI & ML

InsightFace for face detection and embedding generation
Cosine similarity for fast and reliable face matching
Gemini API for:
- Speech-to-text
- Speaker diarization
- Caretaker conversational agent

A major portion of development time was spent tuning diarization parameters, balancing:

Over-segmentation vs under-segmentation
Noise robustness
Latency vs accuracy

This was critical, as incorrect speaker attribution can be harmful in dementia care.
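
To make the over-segmentation trade-off concrete, the sketch below merges consecutive turns from the same speaker when the pause between them is short. The `max_gap` parameter and the tuple layout are assumptions chosen for illustration; the source does not say which diarization parameters were actually tuned.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, speaker_label): a hypothetical turn format
Turn = Tuple[float, float, str]


def merge_turns(turns: List[Turn], max_gap: float = 0.5) -> List[Turn]:
    """Merge consecutive same-speaker turns separated by at most `max_gap`."""
    merged: List[Turn] = []
    for start, end, speaker in sorted(turns):
        if merged:
            prev_start, prev_end, prev_speaker = merged[-1]
            if speaker == prev_speaker and start - prev_end <= max_gap:
                # Extend the previous turn instead of starting a new one.
                merged[-1] = (prev_start, max(prev_end, end), prev_speaker)
                continue
        merged.append((start, end, speaker))
    return merged
```

A small `max_gap` leaves a transcript fragmented into many short turns, while a large one can hide real pauses and blur the boundary between separate exchanges, so the value has to be chosen against real recordings.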

Challenges we ran into ⚠️

Reliable facial recognition in real-world conditions (lighting, angles, motion)
Diarization accuracy in overlapping or noisy conversations
Latency constraints when processing live video and audio streams
Designing for cognitive vulnerability, where errors have emotional impact
Integrating multiple AI systems (vision + voice + agent) into a coherent experience

Accomplishments that we’re proud of 🏆

Built a working end-to-end system, from simulated smart-glasses video capture to AI-powered memory recall
Successfully integrated InsightFace + Gemini in meaningful, non-trivial ways
Designed a system for dementia patients that also works across the whole spectrum, from people who forget the little things to people who struggle in their daily lives
Created a foundation that could realistically be extended into clinical or caregiving environments

What we learned 📚

Assistive technology must be passive and forgiving
Facial recognition alone isn’t enough; context matters
Voice AI becomes powerful only when paired with accurate diarization
Designing for dementia fundamentally changes how you think about UX, safety, and reliability
Small technical errors can have large emotional consequences