Inspiration
Our teammate Sunghoo spent time as a research assistant at Rutgers Medical School studying neurodegenerative diseases — primarily Alzheimer's and dementia. He brought a specific observation to the team: an enormous share of caregiver-patient interaction consists of the same questions repeated on loop. Did you take your medication? Where did you put your glasses? Did someone come to the door? These aren't logistics. They're the texture of cognitive decline, and they consume an extraordinary amount of a caregiver's emotional bandwidth.
The scale is hard to hold in your head. An estimated 7.2 million Americans aged 65 and older are living with Alzheimer's dementia in 2025, a number projected to roughly double by 2060 (Alzheimer's Association). Behind every one of those numbers is a family member answering those same questions. And the harder version of the problem is invisible: the patient themselves, in earlier stages, knows they're losing something and tries to compensate — writing notes, setting alarms, asking the same question hoping for a different answer.
What they actually need is something almost no technology provides: ambient memory for the physical space they live in. Not a phone to pick up. Not an app to open. Just a room that remembers.
That is what Recall is.
What it does
Recall is a small wall-mounted device that watches a room and extracts events, not video. On-device. Continuously. Privately.
Ask it anything in plain language:
Where are my keys? → On the counter, about 20 minutes ago.
Did I take my morning medication? → Yes, at 8:02 AM — you picked up the orange bottle and drank water.
When did the delivery arrive? → At 2:47 PM. The driver left a package by the door.
You either type on the laptop dashboard or press the physical button on the device and speak. You get an answer, cited to the specific events it was inferred from, so you can trust it. If Recall didn't see something happen, it tells you honestly — it doesn't fabricate.
Above the query layer sits a proactive agent that reads the event log alongside a calendar and contact book, detects when scheduled medications are overdue, and drafts a tactful SMS to the caregiver, ready to send. Not a prompt. An artifact.
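The overdue-medication check at the heart of that agent can be sketched in a few lines. This is an illustrative reduction, not the shipped agent: the field names (`label`, `due`, `type`) and the 30-minute grace window are assumptions.

```python
from datetime import datetime, timedelta

def overdue_doses(schedule, events, now, grace=timedelta(minutes=30)):
    """Return scheduled doses whose grace window has passed with no
    matching medication event in the log. Field names and the grace
    period are illustrative assumptions, not Recall's real schema."""
    taken_labels = {e["label"] for e in events if e["type"] == "medication_taken"}
    return [
        dose for dose in schedule
        if dose["due"] + grace < now and dose["label"] not in taken_labels
    ]
```

Only doses that are both past their grace window and absent from the event log trigger a drafted message, which keeps the agent quiet in the common case.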
How we built it
The system is three layers talking across a privacy boundary.
The Pi — the on-device brain. A Raspberry Pi 4B runs a vision pipeline using YOLOv8-nano for object detection and ByteTrack for persistent object identity across frames. A rule-based event extractor watches state transitions: when an object appears for 3 consecutive frames, it emits object_placed; when it disappears for 10 frames, it emits object_picked_up. Actions like drinking are detected from spatial relationships between person and object bounding boxes. Events are written to a local SQLite database with one heavily blurred 128×72 thumbnail per event.
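The appear/disappear rules above amount to a small per-track state machine. A minimal sketch, using the thresholds from the text (3 consecutive frames to emit `object_placed`, 10 missed frames to emit `object_picked_up`); the class and field names are ours, not the actual pipeline's:

```python
from dataclasses import dataclass

APPEAR_FRAMES = 3      # consecutive detections before object_placed
DISAPPEAR_FRAMES = 10  # consecutive misses before object_picked_up

@dataclass
class TrackState:
    present: bool = False
    seen_streak: int = 0
    miss_streak: int = 0

class EventExtractor:
    """Turns per-frame ByteTrack IDs into discrete placed/picked-up events."""

    def __init__(self):
        self.tracks: dict[int, TrackState] = {}

    def step(self, frame_track_ids: set[int]) -> list[tuple[str, int]]:
        events = []
        for tid in set(self.tracks) | frame_track_ids:
            st = self.tracks.setdefault(tid, TrackState())
            if tid in frame_track_ids:
                st.seen_streak += 1
                st.miss_streak = 0
                if not st.present and st.seen_streak >= APPEAR_FRAMES:
                    st.present = True
                    events.append(("object_placed", tid))
            else:
                st.miss_streak += 1
                st.seen_streak = 0
                if st.present and st.miss_streak >= DISAPPEAR_FRAMES:
                    st.present = False
                    events.append(("object_picked_up", tid))
        return events
```

The streak counters are what make the rules robust to single-frame detector flicker: one dropped detection resets the appearance streak but does not fire a spurious pickup event.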
The backend — the reasoning layer. A FastAPI server runs on the Pi itself, exposing a /query endpoint. When a question comes in, the server assembles the relevant events into structured context and sends it to a reasoning model — K2 Think V2 as the primary reasoner for multi-step temporal questions, Claude 4.7 as the reliability failover. Answers come back with confidence levels and event-ID citations.
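The context-assembly step can be illustrated with a small helper. This is a sketch of the idea, not the prompt Recall actually ships: the event fields and instruction wording are assumptions, but the shape — grounded events with IDs, an explicit instruction to cite them and to admit gaps — is the mechanism that makes answers citable and keeps the model from fabricating.

```python
def build_context(events, question):
    """Assemble event rows into a structured prompt for the reasoner.
    Event fields and instruction wording are illustrative assumptions."""
    rows = "\n".join(
        f"[{e['id']}] {e['ts']} {e['type']}: {e['label']}" for e in events
    )
    return (
        "Answer ONLY from the events below. Cite event IDs. "
        "If the events do not show the answer, say so.\n\n"
        f"EVENTS:\n{rows}\n\nQUESTION: {question}"
    )
```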
The dashboard — the glass the user looks through. A Next.js app on a laptop subscribes to a WebSocket feed of live events, renders them in a real-time timeline, accepts typed or voice queries via the Web Speech API, and reads answers aloud via the browser's speech synthesis.
The principle we refused to compromise
A camera watching your home has an obvious problem, and we've seen enough surveillance capitalism to know that "we promise we won't misuse it" is not a trust model. So we made the product impossible to misuse by design.
Raw video is never stored. Ever. On any disk. Frames are processed and immediately discarded. What persists is structured event text — a day of activity fits in kilobytes — plus one heavily blurred thumbnail per event, kept only for UI confirmation. Cloud services only ever see event text, never pixels. This isn't marketing; it's the architecture.
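The entire persistent footprint can be pictured as one small table. This schema is illustrative (column names are ours, not the actual CONTRACTS.md schema), but it captures the invariant: rows of event text plus at most one blurred thumbnail blob — no frame buffer, no video file anywhere in the data model.

```python
import sqlite3

# Illustrative version of the on-device store: structured event text plus
# one blurred thumbnail per event -- never raw video frames.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id     INTEGER PRIMARY KEY,
        ts     TEXT NOT NULL,   -- ISO-8601 timestamp
        type   TEXT NOT NULL,   -- e.g. 'object_placed'
        label  TEXT NOT NULL,   -- display label, e.g. 'keys'
        thumb  BLOB             -- blurred 128x72 JPEG, UI confirmation only
    )
""")
conn.execute(
    "INSERT INTO events (ts, type, label) VALUES (?, ?, ?)",
    ("2025-11-15T14:47:00", "object_placed", "package"),
)
```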
Grounded in real research
Recall is not an invented problem. It has a name in the literature: Episodic Memory Question Answering (EMQA). Bärmann and Waibel (2022), in a paper titled — fittingly — "Where did I leave my keys?", formalized the task and established the core memory constraint:
$$|e| \in \mathcal{O}(1) \quad \text{rather than} \quad |e| \in \Theta(N)$$
where $N$ is the length of the input video and $|e|$ is the size of the memory representation. Any real episodic memory system must compress arbitrary-length experience into a bounded representation. That's the exact design principle Recall obeys — we don't store video, we store events.
Challenges we ran into
Getting the Pi to boot at all. We went through three Raspberry Pi 4B units and multiple SD cards before finding a configuration that worked — we eventually booted from USB instead of SD card, bypassing whatever was failing. This cost us several hours we hadn't budgeted for.
Object detection vocabulary. YOLOv8-nano's COCO training set has 80 classes and none of them are "pill bottle" or "keys." We built the demo around substitute classes (scissors for pill bottle, remote for keys) with a display-layer label translation so the user never sees the raw COCO label. Proper fine-tuning on medication containers is future work.
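The display-layer translation is a lookup table between the COCO stand-ins named above and the labels the user sees, with unmapped classes passing through unchanged:

```python
# Display-layer translation from COCO substitute classes to user-facing
# labels. Keys are the demo stand-ins described above; anything unmapped
# falls through unchanged.
SUBSTITUTE_LABELS = {
    "scissors": "pill bottle",
    "remote": "keys",
}

def display_label(coco_label: str) -> str:
    return SUBSTITUTE_LABELS.get(coco_label, coco_label)
```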
The contract problem. Three components (capture → backend → dashboard) speaking different JSON shapes is how hackathon teams lose Sunday morning. We wrote a formal CONTRACTS.md early, froze the event schema, and integration took minutes instead of hours. Most projects skip this step; we won't skip it next time either.
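A frozen contract only helps if violations fail loudly. A minimal validator in the spirit of CONTRACTS.md — the field names here are illustrative, not the actual schema — shows the idea: every component checks incoming events against the same key set, so a drifted field surfaces as an immediate error instead of a mysterious 2 AM bug.

```python
# Illustrative event contract; field names are assumptions, not the
# actual CONTRACTS.md schema.
REQUIRED_KEYS = {"id", "ts", "type", "label", "confidence"}

def validate_event(event: dict) -> dict:
    """Raise immediately on schema drift instead of failing downstream."""
    missing = REQUIRED_KEYS - event.keys()
    if missing:
        raise ValueError(f"event violates contract, missing: {sorted(missing)}")
    return event
```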
Track ID churn in ByteTrack. When a person leaves the frame and returns, ByteTrack reassigns a new track ID. For object continuity this isn't a problem (objects stay in place), but for person-level analytics it's a known limitation we've logged for future hardening.
Accomplishments that we're proud of
We shipped the system. The full pipeline — camera to events to database to LLM reasoning to dashboard to voice — is running end-to-end on the hardware we brought.
We honored the privacy contract in the architecture, not the marketing copy. A stranger with our disk would find no video and no faces.
We cited the research this product descends from because we actually read it — not because citations look good in submissions. The EMQA task has been sitting in the literature waiting for a product. We built it.
We engineered orthogonal demo-mode and fixture-mode flags so the system is resilient at judging time: live CV with safe reasoning paths, independently dial-able. Most hackathon projects don't survive their demo. We designed ours to.
What we learned
Most privacy claims are marketing. Making one true is architectural. "We care about your privacy" takes seconds to write. Designing a system where it's physically impossible to leak video because no video is ever written to disk takes real engineering decisions. We've learned to tell the difference.
The hardest part of building with LLMs is not the model — it's the contract. When a JSON schema drifts by a single field at 2 AM, everything breaks in ways that look like AI problems but are actually coordination problems.
"Ambient" is a product category, not a buzzword. Something that just watches, just listens, and just answers when you happen to ask is closer to furniture than to software. For a patient at the limit of what they can remember to do, that difference is everything.
What's next for Recall
Clinical pilot. Medication adherence is the single most-tracked variable in dementia pharmacotherapy trials, and it's the data current methodologies struggle most to capture objectively. Recall could supply it without requiring self-report — directly aligned with the Regeneron track's decentralized-clinical-trials framing.
Fine-tuned object classes for healthcare. A small dataset of pill bottles, inhalers, walkers, and glucose meters would dramatically raise detection accuracy on medically relevant objects.
Wearable form factor. The EMQA literature studies egocentric (head-mounted) video. A version of Recall that clips to glasses would extend its memory beyond a single room, serving visually impaired users and early-stage dementia patients who still leave the home.
Long-term temporal reasoning. Right now Recall answers questions about the last day. With better indexing and summarization, it could answer questions about patterns across weeks: "Has mom been more restless at night lately?" That's the question a caregiver actually wants answered.
Recall was built at HackPrinceton 2026 by Jossue, Ariji, Sunghoo, and Jeeyan. We designed it for the 7.2 million Americans with Alzheimer's dementia, their families, and the caregiving workforce behind them. The memory it provides isn't a substitute for the one that's being lost — it's a way for the room to carry some of the weight.
Built With
- anthropic
- bytetrack
- claude
- fastapi
- github
- logitech
- next.js
- opencv
- python
- raspberry-pi
- react
- sqlite
- tailwind-css
- typescript
- ultralytics
- uvicorn
- web-speech-api
- websocket
- yolov8