Inspiration

In fifty years, every app your grandparents ever used will be gone. Facebook, Google Photos, every voicemail — and with them, every story they ever told. We wanted to build a digital heirloom: something you'd actually leave to your kids, the way your grandparents left you a photo album. Not a subscription. Not a feed. An archive that outlives the platform that made it.

The trigger was personal. Our grandparents have told us the same stories for twenty years — a summer trip in 1947, a wedding in Havana before the Revolution — and we realized nobody in the family has written any of them down. The moment those voices are gone, the stories are gone. And in a world where AI can now fake anyone's voice, even a recording isn't enough on its own anymore. You need proof it came from the real person, at the real moment.

What it does

Footprints is a voice-first archive for a grandparent's life. You hand them a microphone. They talk naturally — rambling, remembering, getting dates wrong. When they're done, our pipeline turns that raw monologue into a cinematic, cryptographically-sealed story:

  • Automatic structure. Gemini chunks the transcript into titled stories and extracts every date, place, and person mentioned.
  • AI fills in fading memory. When the speaker says "sometime in the forties, before the war ended," Gemini reasons with context ("Roosevelt was still alive") to narrow it down — and our specialist agents on Featherless cross-reference Wikipedia and the Library of Congress to confirm.
  • Real artifacts, not AI-generated. Every place gets geocoded. Every person gets cross-referenced with Wikipedia. Period newspaper pages come from the actual Library of Congress digital archive. Nothing is fabricated.
  • A polished narration + bespoke music. The rambling transcript is rewritten into a warm narrator script, voiced by ElevenLabs, and paired with a period-appropriate instrumental bed; every single story gets its own soundtrack.
  • A cinematic playback. The map animates the journey. Artifact postcards fade in as each place is mentioned. A moving playhead traces the path in sync with the voice.
  • A certificate of authenticity. Every finished story is sealed on Solana as a Metaplex Core NFT — with the content hash of the raw audio, the recording timestamp, and the speaker identity cryptographically anchored. Tap the "✓ Authenticated" badge and Solana Explorer opens, proving this came from the real person.

Zoom out to the home globe and an entire life appears as golden arcs between places — decades of memories, each one queryable with MongoDB Atlas Vector Search. Type "tell me about his brother" and the archive semantically matches stories that never even used the word.
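A minimal sketch of how that semantic query can hit MongoDB Atlas Vector Search. The index name ("story_embeddings"), vector field ("embedding"), and archive filter field are illustrative assumptions, not the project's actual schema:

```python
def build_semantic_query(query_embedding, archive_id, limit=5):
    """Build an Atlas $vectorSearch aggregation pipeline, scoped to one
    family's archive. Index/field names here are assumptions."""
    return [
        {
            "$vectorSearch": {
                "index": "story_embeddings",   # assumed index name
                "path": "embedding",           # assumed vector field
                "queryVector": query_embedding,
                "numCandidates": limit * 20,   # oversample for recall
                "limit": limit,
                "filter": {"archiveId": archive_id},
            }
        },
        # Surface the match score so the UI can rank results.
        {"$project": {"title": 1, "summary": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

# At query time you'd embed "tell me about his brother" and run
# db.stories.aggregate(build_semantic_query(vec, archive_id)); matches
# come back even if no story ever used the word "brother".
```

Because the match is on embedding similarity rather than keywords, a query about "his brother" can surface a story that only ever says "my younger sibling Tom."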

How we built it

A full-stack AI pipeline orchestrated across six sponsor stacks:

  • Next.js 15 on Vercel for the frontend — live recording, a cinematic /story/[id] player, a 3D globe home built with react-globe.gl and Three.js.
  • MongoDB Atlas stores users, archives, raw sessions, processed stories, and the vector index that powers semantic search across the family's own archive.
  • Gemini 2.5 is the brain of the post-call pipeline. It reads the full rambling transcript, chunks it into coherent titled stories, extracts entities (dates, places, people, events), and resolves fuzzy dates by reasoning with context.
  • Featherless AI hosts our department of specialists. A history agent (Qwen 2.5 14B) and a music-director agent run in parallel via asyncio.gather; once their output is back, a narrator agent (Llama 3.1 8B) uses the extracted keyFacts to write the polished script. Right model per job, not one generalist doing everything.
  • ElevenLabs is the entire audio layer: Scribe (transcript-ready captions), Turbo v2.5 TTS at an unhurried 82% speed for archival tone, and the Music API composing a fresh instrumental per story.
  • Mapbox GL JS draws the 2D animated journey, warmed with a CSS sepia filter and paper-grain overlay to feel like an archival wall map.
  • Library of Congress Chronicling America + Wikipedia REST + OpenStreetMap Nominatim provide the enrichment artifacts. Every newspaper, portrait, and geocoded place is a real, cited public record.
  • Solana (devnet) and Metaplex Core seal each story as an attested NFT — content hash, timestamp, speaker — minted from a server wallet to the user's Privy-custodial wallet.
  • Privy handles auth + embedded Solana wallets. Sign in with email, get a custodial wallet in the background; no seed phrase, no MetaMask.
  • Vultr was the planned host. A new-account review hold forced the live URL onto Vercel, but the full Vultr deploy stack ships in /deploy and is production-ready.
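The Featherless fan-out described above (parallel specialists, then a dependent narrator) can be sketched with asyncio.gather. The agent bodies are stubs standing in for the real Featherless-hosted model calls; the facts and function names are illustrative:

```python
import asyncio

async def history_agent(transcript: str) -> dict:
    # Would call the Qwen 2.5 14B history specialist on Featherless.
    return {"keyFacts": ["Ebbets Field, Brooklyn", "April 1947"]}

async def music_agent(transcript: str) -> dict:
    # Would call the music-director agent for an era-appropriate brief.
    return {"mood": "warm", "era": "1940s"}

async def narrator_agent(transcript: str, key_facts: list) -> str:
    # Would call Llama 3.1 8B with the verified keyFacts in its prompt.
    return f"Narration grounded in {len(key_facts)} verified facts."

async def process_story(transcript: str) -> dict:
    # Independent specialists run concurrently...
    history, music = await asyncio.gather(
        history_agent(transcript), music_agent(transcript)
    )
    # ...then the narrator runs last, since it consumes their output.
    script = await narrator_agent(transcript, history["keyFacts"])
    return {"script": script, "music_brief": music}

result = asyncio.run(process_story("raw rambling transcript..."))
```

The shape matters more than the stubs: only the narrator has a data dependency, so everything upstream of it is free parallelism.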

Challenges we ran into

  • Gemini Live tool-calling was unreliable for mid-conversation memory-gap filling, so we moved that reasoning entirely into the post-call Gemini 2.5 orchestrator. Same outcome, far more stable.
  • Solana transaction size. Our first NFT mints hit the 1232-byte limit because we were inlining full metadata JSON. We moved to a URI-based metadata route (/api/story/[id]/metadata) so only the pointer goes on-chain.
  • Wikipedia's OpenSearch fallback is wrong-answer-happy — it once pulled "Aunt Bee" (Andy Griffith character) for a story that mentioned "Aunt Bea." We curated the demo story's artifacts by hand and added a stricter matcher for live recordings.
  • Processing-page pacing. The live pipeline sometimes finishes in 8 seconds, too fast to narrate over on camera. We added a demo-mode minimum-time gate that floors the processing screen at 75 seconds and plays a scripted agent-card progression; real worker progress overrides each card's state as it arrives.
  • Vultr account review. Our new account was held for 24h review right before the deadline. We'd already built the full Vultr deploy stack — systemd units, nginx config, Object Storage integration, one-shot deploy.sh — so we pivoted the live URL to Vercel while keeping the Vultr code path ready to flip.
  • Narrator pacing. Default ElevenLabs TTS felt rushed for the archival tone. Turbo v2.5 is the only production model that exposes an explicit speed knob — we landed at 0.82 and it instantly felt right.
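The transaction-size fix above reduces to one idea: hash the raw audio, serve the full metadata from an API route, and put only a pointer in the mint. A minimal sketch with hashlib, where the field names and URI route are illustrative assumptions:

```python
import hashlib
import json

def seal_story(raw_audio: bytes, story_id: str, recorded_at: str,
               speaker: str) -> dict:
    content_hash = hashlib.sha256(raw_audio).hexdigest()
    # Full metadata lives off-chain, served by the app's metadata route.
    off_chain = {
        "name": f"Footprints Story {story_id}",
        "contentHash": content_hash,
        "recordedAt": recorded_at,
        "speaker": speaker,
    }
    # Only a name + URI pointer goes into the mint instruction.
    on_chain = {
        "name": off_chain["name"],
        "uri": f"https://example.app/api/story/{story_id}/metadata",
    }
    return {"on_chain": on_chain, "off_chain": off_chain}

sealed = seal_story(b"...wav bytes...", "abc123",
                    "2025-01-01T12:00:00Z", "Grandpa Joe")
# The serialized on-chain portion stays far below the 1232-byte tx limit.
assert len(json.dumps(sealed["on_chain"]).encode()) < 1232
```

Anyone holding the original recording can recompute the SHA-256 and compare it against the anchored hash, which is what makes the "✓ Authenticated" badge verifiable rather than decorative.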

Accomplishments we're proud of

  • Emotional demo. The moment Jackie Robinson steps up to the plate in the fourth inning — map pins, music bed, narrator voice, and the Wikipedia portrait all sync at the right beat. Judges won't forget it.
  • Six sponsor stacks with load-bearing roles. Every sponsor's tech does real work. Nothing is stickered on.
  • A real, deterministic demo. We baked the Ebbets Field story through the actual production pipeline and committed the result to the repo so the demo is survivable against a DB wipe or fresh clone.
  • Real archival authenticity. The Jackie Robinson image is from Wikimedia Commons. The Ebbets Field photo is from 1913. The Pennsylvania Turnpike logo is authentic. Nothing here is AI-generated imagery.
  • A meaningful Solana use. Not a JPEG speculation. An attested content hash that solves a 2028 problem: proving a recording is real when AI can fake any voice.

What we learned

  • The right model for each narrow job beats one generalist. Featherless's catalog made that feasible.
  • Vector search on a tiny archive (one family's stories) is wildly useful — semantic matching across a small corpus finds things that literal search never would.
  • Solana mint payloads need to be tiny. Store metadata externally; anchor only the hash on-chain.
  • In a demo video, timing floors matter more than timing ceilings. The risk isn't that things take too long — it's that they finish in 4 seconds and you have nothing to narrate over.
  • Build the kill-flags in the first two hours, or you'll discover you need them at 2am.
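The timing-floor lesson can be sketched as a small wrapper: run the real work, then hold the screen until a minimum duration has elapsed. This is an illustrative sketch, not the project's code; a short floor is used here so it runs quickly (the demo used 75 seconds):

```python
import asyncio
import time

async def with_minimum_time(work, floor_s: float):
    """Await `work`, but don't resolve before `floor_s` seconds pass."""
    start = time.monotonic()
    result = await work
    remaining = floor_s - (time.monotonic() - start)
    if remaining > 0:
        # Hold the processing screen; scripted cards keep animating.
        await asyncio.sleep(remaining)
    return result

async def fast_pipeline():
    await asyncio.sleep(0.01)  # stands in for an 8-second real pipeline
    return "done"

t0 = time.monotonic()
result = asyncio.run(with_minimum_time(fast_pipeline(), floor_s=0.05))
elapsed = time.monotonic() - t0  # always >= the floor
```

Real progress events can still update the UI per-key while the gate is open; only the final resolution is floored.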

What's next

  • Voice cloning opt-in. With explicit consent, the narrator voice can become the grandparent's own voice — which is closer to the real emotional artifact.
  • Shared family archives. Today an archive is per-user. Making them multi-custodian lets cousins, siblings, grandkids all contribute.
  • A gift box. QR-code keepsake that opens an authenticated Footprints archive. Something a grandkid actually puts on the mantle.

Built With
