Inspiration
While working as a local travel companion in Sydney, I realized travelers don’t just want photos — they want to preserve how they felt: the sunlight on the Opera House sails, the rhythm of waves at Bondi, the eucalyptus scent after rain. Camera rolls capture scenes, but often miss emotion.
So I built Memory — an AI that transforms small travel moments into cinematic, poetic narratives, with a tone inspired by Wong Kar-wai’s mood and fragmented voiceover (a stylistic homage, not an endorsement).
What it does
Memory turns your three key moments into a complete “cinematic narrative poster”:
- Upload vertical photos (full display, no cropping)
- Describe three moments (location + vibe + a memorable detail)
- Generate via Gemini Flash (Gemini API):
  - Bilingual cinematic titles (Chinese + English)
  - Optional location stamps (GPS or city-level, privacy-aware)
  - Fragmented cinematic narration + “golden lines”
  - Smooth transitions across moments
  - A dramatic curtain call
- Download the full poster or individual scene cuts
The result: a beautifully designed narrative that preserves not only what you saw, but how you felt.
How we built it
- Frontend: React 19 + Vite (Rolldown) + Tailwind CSS v4 (oklab color space)
- AI integration: Gemini Flash API
  - Structured JSON Schema for consistent multi-part storytelling
  - System instructions to guide a cinematic voice
  - True bilingual generation (not translation)
- Backend: Vercel Serverless Functions as a secure proxy (API key stays server-side)
- Export: switched from html2canvas to html-to-image for modern CSS compatibility (e.g., oklab)
- Practical safeguards: CORS origin control, IP rate limiting, input validation, and cost/timeout controls
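The structured JSON Schema mentioned above is what keeps every generation in the same multi-part shape. A minimal sketch of the idea, with illustrative field names (not the exact production schema):

```javascript
// Sketch of the kind of response schema passed to the Gemini API so every
// generation returns the same multi-part structure. Field names here are
// illustrative assumptions, not the exact production schema.
const narrativeSchema = {
  type: "object",
  properties: {
    titleZh: { type: "string" },          // Chinese cinematic title
    titleEn: { type: "string" },          // English cinematic title
    scenes: {
      type: "array",
      minItems: 3,
      maxItems: 3,
      items: {
        type: "object",
        properties: {
          location: { type: "string" },
          narration: { type: "string" },  // fragmented voiceover text
          goldenLine: { type: "string" }, // the scene's quotable line
        },
        required: ["location", "narration", "goldenLine"],
      },
    },
    curtainCall: { type: "string" },      // closing line for the poster
  },
  required: ["titleZh", "titleEn", "scenes", "curtainCall"],
};

// Minimal runtime guard: reject a model response that drifts from the
// expected shape before it reaches the poster renderer.
function isValidNarrative(data) {
  return (
    typeof data?.titleZh === "string" &&
    typeof data?.titleEn === "string" &&
    Array.isArray(data?.scenes) &&
    data.scenes.length === 3 &&
    data.scenes.every(
      (s) =>
        typeof s.location === "string" &&
        typeof s.narration === "string" &&
        typeof s.goldenLine === "string"
    ) &&
    typeof data?.curtainCall === "string"
  );
}
```

A guard like this is cheap insurance: even with a schema constraint on the model side, the renderer never has to trust the response blindly.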
Data note: We aim to minimize retention; images are processed for generation/export and not used for model training (deployment-configurable).
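The IP rate limiting listed under practical safeguards can be sketched as a fixed-window counter in the proxy. The window length and request cap below are assumptions for illustration; on serverless, a per-instance map like this resets on cold starts, so a production deployment would typically back it with a shared store:

```javascript
// Fixed-window rate limiter keyed by client IP. Limits are illustrative
// assumptions, not the production values. Note: each serverless instance
// gets its own map, so this is per-instance state only.
const WINDOW_MS = 60_000; // 1-minute window (assumed)
const MAX_REQUESTS = 10;  // per IP per window (assumed)

const hits = new Map();   // ip -> { count, windowStart }

function allowRequest(ip, now = Date.now()) {
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // First request from this IP, or the previous window expired: start fresh.
    hits.set(ip, { count: 1, windowStart: now });
    return true;
  }
  if (entry.count < MAX_REQUESTS) {
    entry.count += 1;
    return true;
  }
  return false; // caller responds with HTTP 429
}
```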
Challenges we ran into
- CSS compatibility: html2canvas couldn’t parse Tailwind v4’s oklab(), breaking downloads → migrated to html-to-image and rewrote the export logic
- Vertical image cropping → removed max-height/overflow constraints for full display
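The cropping fix amounts to dropping the constraints that forced tall photos into a fixed box. A sketch of the change (class name and pixel value are illustrative, not the actual stylesheet):

```css
/* Before (illustrative): constraints that cropped tall photos */
.scene-photo {
  max-height: 480px;   /* forced vertical images to crop */
  overflow: hidden;
}

/* After: let vertical photos render in full */
.scene-photo {
  width: 100%;
  height: auto;        /* preserve aspect ratio, no cropping */
  object-fit: contain; /* show the whole frame if a container bounds it */
}
```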
- Deployment identity mismatch (team vs. personal Git author) → CLI deployment workflow + proper repo connection setup
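Part of untangling the identity mismatch is pinning the commit author per repository rather than relying on the global Git config. A sketch, with the repo path and identity values as placeholders:

```shell
# Set a repo-local Git identity so commits (and therefore deploys tied to
# them) attribute to the team account, not a personal one.
# Path and identity values below are placeholders.
git init /tmp/memory-demo
cd /tmp/memory-demo
git config user.name "memory-team"
git config user.email "team@example.com"
git config user.name   # verify the local override took effect
```

Repo-local config takes precedence over `--global`, so this avoids touching the contributor's personal identity on other projects.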
Accomplishments that we're proud of
- Real user impact: testers said it captured feelings they couldn’t easily express.
- Aesthetic coherence: authentic cinematic vibe, not template-like output.
- Reliable creative generation: schema + system instructions for consistency.
- Strong engineering delivery: solved key blockers quickly and shipped a polished build.
What we learned
- Structured outputs are essential for creative reliability
- Multilingual ≠ translation — cultural rhythm matters
- Modern CSS is powerful but demands end-to-end compatibility checks
- User needs beat flashy tech
What's next for Memory
Multimodal story extraction, privacy-aware auto-location, more stylistic homages (Wes Anderson / Ghibli / Terrence Malick, etc.), audio narration, and a mobile app.
Why This Matters: I’ve seen the gap between emotional preservation and today’s photo-first tools. Memory’s mission is to elevate raw travel moments into vivid, cinematic memory vignettes — crafted through words and design.
Built With
- gemini-3.0-flash-api
- html-to-image
- java
- javascript
- jszip
- lucide-react
- react-19
- tailwind-css-v4
- vercel-serverless-functions
- vite