Inspiration

While working as a local travel companion in Sydney, I realized travelers don’t just want photos — they want to preserve how they felt: the sunlight on the Opera House sails, the rhythm of waves at Bondi, the eucalyptus scent after rain. Camera rolls capture scenes, but often miss emotion.

So I built Memory — an AI that transforms small travel moments into cinematic, poetic narratives, with a tone inspired by Wong Kar-wai’s mood and fragmented voiceover (a stylistic homage, not an endorsement).

What it does

Memory turns your three key moments into a complete “cinematic narrative poster”:

  • Upload vertical photos (full display, no cropping)
  • Describe three moments (location + vibe + a memorable detail)
  • Generate via Gemini Flash (Gemini API):
      • Bilingual cinematic titles (Chinese + English)
      • Optional location stamps (GPS or city-level, privacy-aware)
      • Fragmented cinematic narration + “golden lines”
      • Smooth transitions across moments
      • A dramatic curtain call
  • Download the full poster or individual scene cuts

The result: a beautifully designed narrative that preserves not only what you saw, but how you felt.

How we built it

  • Frontend: React 19 + Vite (Rolldown) + Tailwind CSS v4 (oklab color space)
  • AI Integration: Gemini Flash API
      • Structured JSON Schema for consistent multi-part storytelling
      • System Instructions to guide a cinematic voice
      • True bilingual generation (not translation)
  • Backend: Vercel Serverless Functions as a secure proxy (the API key stays server-side)
  • Export: switched from html2canvas to html-to-image for modern CSS compatibility (e.g., oklab)
  • Practical safeguards: CORS origin control, IP rate limiting, input validation, and cost/timeout controls
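The structured-output approach above can be sketched as a request builder. This is a minimal illustration, not our production code: the schema fields (titleZh, narration, goldenLine), the helper name buildStoryRequest, and the system instruction text are all illustrative, and the SDK wiring in the comments assumes Google's @google/genai JavaScript client.

```javascript
// Illustrative sketch: building a structured-output request for the
// Gemini API. Schema fields and helper names are hypothetical.

function buildStoryRequest(moments) {
  // One schema per scene keeps the multi-part story consistent.
  const sceneSchema = {
    type: "object",
    properties: {
      titleZh: { type: "string" },   // Chinese cinematic title
      titleEn: { type: "string" },   // English cinematic title
      narration: { type: "string" }, // fragmented voiceover text
      goldenLine: { type: "string" } // one quotable "golden line"
    },
    required: ["titleZh", "titleEn", "narration", "goldenLine"]
  };

  return {
    model: "gemini-2.0-flash",
    contents: moments
      .map((m) => `${m.location} | ${m.vibe} | ${m.detail}`)
      .join("\n"),
    config: {
      // System instruction steers the cinematic voice for every call.
      systemInstruction:
        "You are a cinematic narrator. Write fragmented, mood-driven voiceover in Chinese and English.",
      // Structured JSON output instead of free-form prose.
      responseMimeType: "application/json",
      responseSchema: {
        type: "object",
        properties: {
          scenes: { type: "array", items: sceneSchema },
          curtainCall: { type: "string" } // the dramatic closing line
        },
        required: ["scenes", "curtainCall"]
      }
    }
  };
}

// Server-side usage (so the API key never reaches the browser), assuming
// the @google/genai SDK:
//   const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
//   const res = await ai.models.generateContent(buildStoryRequest(moments));
//   const story = JSON.parse(res.text);
```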

Data note: We aim to minimize retention; images are processed for generation/export and not used for model training (deployment-configurable).
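The proxy safeguards can be sketched with a fixed-window, in-memory IP rate limiter. The limits (10 requests per minute), the helper name allowRequest, and the handler wiring in the comments are assumptions for illustration, not the deployed configuration.

```javascript
// Illustrative sketch of an in-memory, fixed-window IP rate limiter,
// as used behind a serverless proxy. Limits are hypothetical.

const WINDOW_MS = 60_000; // 1-minute window
const MAX_HITS = 10;      // allowed requests per IP per window
const hits = new Map();   // ip -> { windowStart, count }

function allowRequest(ip, now = Date.now()) {
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    // First request in a fresh window: reset the counter.
    hits.set(ip, { windowStart: now, count: 1 });
    return true;
  }
  entry.count += 1;
  return entry.count <= MAX_HITS;
}

// In a Vercel function (e.g., api/generate.js), roughly:
//   export default async function handler(req, res) {
//     if (req.headers.origin !== process.env.ALLOWED_ORIGIN) return res.status(403).end();
//     const ip = req.headers["x-forwarded-for"] ?? "unknown";
//     if (!allowRequest(ip)) return res.status(429).end();
//     // ...validate input, call Gemini with the server-side key, forward result
//   }
```

Note that in-memory state only throttles per warm serverless instance; a shared store would be needed for strict global limits.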

Challenges we ran into

CSS compatibility: html2canvas couldn’t parse Tailwind v4’s oklab(), breaking downloads → Migrated to html-to-image and rewrote export logic
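The rewritten export path can be sketched as below. toPng is html-to-image's real export function; the "#poster" selector, the sceneFileName helper, and the pixelRatio choice are illustrative assumptions.

```javascript
// Illustrative sketch of the export path after migrating to
// html-to-image, which handles modern CSS (e.g., oklab colors) that
// html2canvas could not parse. Helper names are hypothetical.

function sceneFileName(index) {
  return `memory-scene-${index + 1}.png`;
}

// Browser-only: html-to-image rasterizes a live DOM node to a data URL.
async function downloadScene(node, index) {
  const { toPng } = await import("html-to-image");
  const dataUrl = await toPng(node, { pixelRatio: 2 }); // 2x for crisp posters
  const link = document.createElement("a");
  link.download = sceneFileName(index);
  link.href = dataUrl;
  link.click();
}

// Usage: downloadScene(document.querySelector("#poster"), 0);
```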

Vertical image cropping → Removed max-height/overflow constraints for full display

Deployment identity mismatch (team/personal Git author) → CLI deployment workflow + proper repo connection setup

Accomplishments that we're proud of

Real user impact: testers said it captured feelings they couldn’t easily express.

Aesthetic coherence: authentic cinematic vibe, not template-like output.

Reliable creative generation: schema + system instructions for consistency.

Strong engineering delivery: solved key blockers quickly and shipped a polished build.

What we learned

  • Structured outputs are essential for creative reliability
  • Multilingual ≠ translation — cultural rhythm matters
  • Modern CSS is powerful but demands end-to-end compatibility checks
  • User needs beat flashy tech

What's next for Memory

Multimodal story extraction, privacy-aware auto-location, more stylistic homages (Wes Anderson / Ghibli / Terrence Malick, etc.), audio narration, and a mobile app.

Why This Matters: I’ve seen the gap between emotional preservation and today’s photo-first tools. Memory’s mission is to elevate raw travel moments into vivid, cinematic memory vignettes — crafted through words and design.

Built With

  • gemini-3.0-flash-api
  • html-to-image
  • java
  • javascript
  • jszip
  • lucide-react
  • react-19
  • tailwind-css-v4
  • vercel-serverless-functions
  • vite