Rem | Devpost

Memory at the ping pong table
Memory on stage
AI-augmented Memory on stage
Landing page
Upload page

Inspiration

Memories as a whole are spatial, and one of the most powerful and nostalgic ways of capturing these memories is through photographs. They are the closest humans get to reliving their experiences, but we wondered if it was possible to get one step closer.

What it does

We came up with a way to relive your memories: Rem uses Gaussian Splatting to reconstruct a 3D scene from a natural-language description of the memory, photos of it, or a video of it. Rem then lets you move inside the 3D space, which a creative agent customizes to feel as close as possible to how the moment felt.

How we built it

🎨 Frontend: Next.js 16 (App Router) + React 19 + Tailwind, with a three-screen flow (ingest → loading → 3D viewer)

🎙️ Voice input: intuitive browser-native Web Speech API for live transcription — no audio leaves the device

🧠 Agent-based personalization system: Gemini + Pika MCP power a lightweight agent pipeline that transforms raw inputs into a consistent memory representation and guided reconstruction.

We use two main components:

Structured memory agents (Gemini sub-agents)

vision: extracts structured understanding from uploaded photos
analyzer: combines past memories + current input into a recurring “world summary”
extractor: isolates the relevant slice of that world for the new memory
persona: builds a persistent, evolving visual identity across memories

These help maintain consistency across reconstructions.
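As a rough illustration of how four sub-agents like these could be chained, here is a minimal sketch. The `build_memory` function, the `MemoryRepresentation` type, the string-in/string-out `Agent` signature, and the order in which outputs feed into each other are all assumptions made for this example; the writeup does not specify them, and the actual Gemini calls are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# An "agent" here is any function from a text prompt to a text answer.
# In a real system these would wrap model calls; that wiring is omitted.
Agent = Callable[[str], str]


@dataclass
class MemoryRepresentation:
    photo_facts: str     # output of the vision agent
    world_summary: str   # output of the analyzer agent
    scene_slice: str     # output of the extractor agent
    persona: str         # output of the persona agent


def build_memory(photos: str, past_memories: str, prior_persona: str,
                 agents: Dict[str, Agent]) -> MemoryRepresentation:
    """Chain the four sub-agents. The ordering is an assumption."""
    photo_facts = agents["vision"](photos)
    world_summary = agents["analyzer"](f"{past_memories}\n{photo_facts}")
    scene_slice = agents["extractor"](f"{world_summary}\n{photo_facts}")
    persona = agents["persona"](f"{prior_persona}\n{scene_slice}")
    return MemoryRepresentation(photo_facts, world_summary, scene_slice, persona)


# Stub agents that only tag their input, so the data flow is visible.
stubs = {name: (lambda text, n=name: f"<{n}>{text}</{n}>")
         for name in ("vision", "analyzer", "extractor", "persona")}
memory = build_memory("beach.jpg", "earlier trips", "", stubs)
```

Passing the agents in as plain callables keeps the flow testable without any network access; each stub can later be swapped for a real model call.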

Creative + tool-agent layer (Pika MCP)

The system also includes a creative agent that generates the cinematic direction for each memory reconstruction. It takes the scene context and persona and translates them into a coherent visual style for the output.

Supporting tools include:

fix_look: re-grades video (lighting, palette, mood, clothing, accessories) while preserving geometry and identity
music: selects or generates audio to maintain emotional continuity

🌎 3D rendering: Three.js + gsplat for real-time Gaussian Splat rendering. The splat itself is built from key points extracted via Structure-from-Motion (SfM) with COLMAP → Gaussian initialization → rasterization and gradient descent → Gaussian densification and pruning.
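The densification and pruning step can be pictured as a per-Gaussian decision rule. The sketch below is a simplified version in the spirit of the adaptive density control described in the original 3D Gaussian Splatting work. It is not the gsplat library's API, and the `Gaussian` fields, the `density_action` name, and all threshold values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    scale: float      # size of the Gaussian's largest axis, in scene units
    opacity: float    # in [0, 1]
    grad_norm: float  # averaged view-space positional gradient magnitude


def density_action(g: Gaussian,
                   grad_threshold: float = 0.0002,
                   scale_threshold: float = 0.01,
                   min_opacity: float = 0.005) -> str:
    """Decide what to do with one Gaussian (thresholds are illustrative)."""
    # Nearly transparent Gaussians contribute little and are removed.
    if g.opacity < min_opacity:
        return "prune"
    # A large positional gradient suggests the region is poorly reconstructed.
    if g.grad_norm > grad_threshold:
        # Big Gaussians over-cover the area and are split into smaller ones;
        # small Gaussians under-cover it and are cloned.
        return "split" if g.scale > scale_threshold else "clone"
    return "keep"
```

In a real trainer this check would run periodically over all Gaussians between gradient descent steps, rather than one Gaussian at a time.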

🔄 Pipeline: user input → personalization → 3D memory reconstruction

💾 Storage: Redis-backed job/scene store with an in-memory fallback, so the whole app degrades gracefully without infra
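One way such a fallback can work is sketched below. The `FallbackStore` and `DownBackend` classes are hypothetical names for this example, not the project's code, and the primary backend is assumed to raise Python's built-in `ConnectionError` when unreachable. A real Redis client raises its own exception type, which would need to be caught instead.

```python
from typing import Dict, Optional, Protocol


class KeyValueBackend(Protocol):
    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...


class DownBackend:
    """Stand-in for an unreachable primary store (hypothetical)."""

    def get(self, key: str) -> Optional[str]:
        raise ConnectionError("primary store unreachable")

    def set(self, key: str, value: str) -> None:
        raise ConnectionError("primary store unreachable")


class FallbackStore:
    """Uses the primary backend when it works, a local dict otherwise."""

    def __init__(self, primary: Optional[KeyValueBackend] = None) -> None:
        self._primary = primary
        self._memory: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        # Always keep a local copy so reads still work if the primary fails.
        self._memory[key] = value
        if self._primary is not None:
            try:
                self._primary.set(key, value)
            except ConnectionError:
                pass  # degrade silently to the in-memory copy

    def get(self, key: str) -> Optional[str]:
        if self._primary is not None:
            try:
                value = self._primary.get(key)
                if value is not None:
                    return value
            except ConnectionError:
                pass  # fall through to the in-memory copy
        return self._memory.get(key)


store = FallbackStore(DownBackend())
store.set("job:1", "done")
```

The in-memory copy only lives as long as the process, so this trades durability for the ability to run with no infrastructure at all.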

📈 Observability: OpenTelemetry tracing into Arize AX, plus an LLM-as-judge evaluator with a feedback loop for grading hotspot quality

Challenges we ran into

🎥 Consistent scene generation for 3D Gaussian Splatting (3DGS): Our initial approach relied on fully generated videos from Midjourney. However, 3DGS depends on Structure-from-Motion, which requires smooth camera motion and consistent scene geometry across frames. Generated videos frequently introduced temporal inconsistencies that degraded reconstruction quality.

To address this, we shifted toward real photos and videos while using Pika MCP to apply controlled personalization. This preserved the consistency required for reconstruction while still allowing creative modifications.

💸 Tool costs: We also experimented with Veo3-generated videos, which produced significantly better temporal consistency. Unfortunately, the cost of generating sufficient video data quickly exhausted our available credits. With greater resources, we believe a fully generative memory reconstruction pipeline could become feasible.

⏳ Training times: Training each splat took a significant amount of time (at least 30 minutes) and would sometimes hang for much longer when the input had many photos or frames. This led us to spend a lot of effort trying different training inputs, from generated videos to generative mesh viewpoints. In the end, we realized that sampling every other frame in a video could significantly speed up the process with minimal impact on visual quality.
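Sampling every other frame amounts to keeping frames at a fixed stride of 2. A minimal sketch, assuming the frames have already been extracted into an ordered list (the `sample_frames` name and the `stride` parameter are made up for this example):

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], stride: int = 2) -> List[T]:
    """Keep every `stride`-th frame, starting with the first one."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(frames[::stride])


# Frames can be file paths, arrays, or any other per-frame object.
kept = sample_frames(["f0.png", "f1.png", "f2.png", "f3.png", "f4.png"])
```

With a stride of 2, a 300-frame clip shrinks to 150 frames, which halves the number of images the pose recovery and training steps have to process.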

Accomplishments that we're proud of

🎮 UI: We built a Three.js-powered viewer that successfully captures the feeling of stepping back into a memory rather than simply viewing media.

✨ 3DGS Quality: Despite having only a single day to develop and iterate, we achieved surprisingly strong reconstruction quality. We were especially excited to reconstruct a live human subject with limited distortion, since dynamic people are traditionally challenging for Gaussian Splatting yet are central to many memories.

What we learned

We learned how to design systems with long-latency AI pipelines involving video generation, scene reconstruction, and personalization. We also gained a much deeper understanding of 3D Gaussian Splatting, particularly the importance of input consistency and data quality.

Most importantly, we explored how creative agents can personalize experiences rather than simply generate content.

What's next for Rem

We plan to expand Rem so that users can traverse multiple memories at once by grouping them. For example, someone who went to Florida on vacation could upload their photos of the beach, the southernmost point of the continental US, and Disney World separately, then group them to navigate from one scene to another.

Another large area to explore is making memories more shareable. We will likely make Rem a platform where users can share their memories and information hotspots with other users. Those users can then add their own memories to create more hotspots, turning a scene into a multi-layered reconstruction.

It's like the Pensieve in Harry Potter, where multiple memories are pulled out of one's mind and layered together. As Dumbledore said, "I sometimes find, and I am sure you know the feeling, that I simply have too many thoughts and memories crammed into my mind."

Rem gives those memories a place to live.

Built With

agents
anthropic
arize
claude
gaussian-splatting
gemini
gsplat
javascript
next.js
node.js
pika
python
react
redis
sharp
supabase
tailwindcss
three.js
typescript
web-speech-api

Submitted to

UC Berkeley AI Hackathon 2026

Created by

I worked on the backend and connected Redis, Supabase, and Arize to make sure our backend and database were connected correctly and efficiently.

Suhaan Khan
Built the video-to-Gaussian-splat pipeline by extracting video frames, using COLMAP to recover camera poses, and using the gsplat library to generate the Gaussian splat.

Yifan Luo
Worked on ideation, 3DGS training, and the creative orchestration agent for video generation.

Hailey Lin
Designed and built a clean frontend UI featuring a custom, responsive grid layout where user-generated memories are rendered as interactive tiles.

Eduardo Hernández