Inspiration
The Iran conflict put something obvious in front of us: people leaving their homes forever with a phone in their pocket. 1 in 67 people on Earth has been forcibly displaced. Most of them have footage. Nobody had built the pipeline from a shaky phone video to a place you can actually walk back into, so we did.
What it does
Recall turns a phone video into an interactive 3D space you can walk through in a browser. You film a place, upload it, and the system reconstructs it as a Gaussian splat: a cloud of 3D Gaussians that renders as a navigable scene you can orbit, walk around, and step back into. No cloud and no account, so your places stay yours.
How we built it
Phone video gets split into frames with OpenCV, fed into MonoGS for monocular SLAM tracking and Gaussian optimization, and written out as a .ply file. A FastAPI backend running on a local ASUS NUC handles job queuing, status polling, and file serving. The frontend uses Three.js for the globe and a 3D Gaussian splat renderer to display the splats in the browser. Memories are cached in OPFS (the browser's Origin Private File System), so the whole thing keeps working even when the server is off.
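A minimal sketch of what the frame extraction step could look like, assuming a fixed target sampling rate and a cap on the longer image side. The function names (frame_stride, scaled_size, extract_frames) and the default values (5 frames per second, 960 pixels, a 30 fps fallback) are illustrative assumptions, not the settings Recall actually uses.

```python
from pathlib import Path


def frame_stride(source_fps: float, target_fps: float) -> int:
    """Number of source frames to advance per kept frame (never less than 1)."""
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, round(source_fps / target_fps))


def scaled_size(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Shrink (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def extract_frames(video_path: str, out_dir: str,
                   target_fps: float = 5.0, max_side: int = 960) -> int:
    """Write every stride-th frame of the video as a PNG and return how many were saved."""
    import cv2  # imported lazily so the helpers above work without OpenCV installed

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"could not open video: {video_path}")

    # Some containers report 0 for FPS; 30 is an assumed fallback for phone footage.
    source_fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    stride = frame_stride(source_fps, target_fps)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    index = 0
    saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:
            height, width = frame.shape[:2]
            size = scaled_size(width, height, max_side)
            frame = cv2.resize(frame, size, interpolation=cv2.INTER_AREA)
            cv2.imwrite(str(out / f"{saved:06d}.png"), frame)
            saved += 1
        index += 1

    cap.release()
    return saved
```

Keeping the stride and size calculations in their own small functions makes the two knobs (extraction rate and resolution scaling) easy to tune and test without touching a video file.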
Challenges we ran into
MonoGS assumes smooth camera motion. People do not film that way. Getting clean reconstructions from casual handheld footage required tuning the frame extraction rate, resolution scaling, and calibration assumptions. We also lost hours to a coordinate system mismatch: MonoGS is Y-down and Three.js is Y-up. The bug turned out to be a single 180-degree rotation applied in the wrong place. The other challenge was building offline-first from the start: OPFS, static memory bundles, a three-layer loading fallback. Not glamorous, but it is what makes the thing usable in the field.
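The Y-down to Y-up mismatch can be sketched in a few lines. This assumes the usual computer-vision camera convention on the reconstruction side (x right, y down, z forward) and the Three.js convention on the viewer side (x right, y up, camera looking down negative z); under those assumptions the conversion is a 180-degree rotation about the x axis, which negates y and z. The function names are hypothetical.

```python
import math


def rotate_x(point, degrees):
    """Rotate a 3D point about the x axis by the given angle (right-handed)."""
    x, y, z = point
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    return (x, y * c - z * s, y * s + z * c)


def y_down_to_y_up(point):
    """180-degree rotation about x, written out directly: negate y and z."""
    x, y, z = point
    return (x, -y, -z)


# A point one unit below and two units in front of a Y-down camera.
p_cv = (0.0, 1.0, 2.0)
p_three = y_down_to_y_up(p_cv)  # same point expressed in the Y-up frame

# The shortcut agrees with the general rotation up to floating-point error.
assert all(abs(a - b) < 1e-9 for a, b in zip(p_three, rotate_x(p_cv, 180.0)))

# Applying the flip twice cancels out, so it has to happen exactly once.
assert y_down_to_y_up(p_three) == p_cv
```

Because the flip is its own inverse, applying it at two places in the pipeline (for example once to the data and once to the scene root) silently undoes it, which is one way a rotation "in the wrong place" can look like no rotation at all.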
Accomplishments that we're proud of
The entire pipeline runs on consumer hardware with no cloud dependency. A place filmed on a phone becomes something you can walk around in a browser, stored locally, accessible indefinitely. That loop closing end-to-end was the moment it felt real.
What we learned
The gap between a research demo and software that works on real footage is enormous. MonoGS is a powerful system, but getting it to handle the way humans actually hold phones, move through rooms, and capture the places they care about required confronting every assumption baked into the academic pipeline.
What's next for Recall
Better reconstruction quality on challenging footage. Mobile capture guidance to help users film in ways that produce cleaner splats. Shared memory spaces so families can contribute footage of the same place from different devices. And longer term: a lightweight export format that works in VR, so you can not just view a place but stand in it at full scale.