Inspiration

We all have hundreds of great photos sitting in our phones — but turning them into cinematic reels takes time, effort, and editing skills. Between choosing music, adding transitions, and syncing everything perfectly, creating a polished reel can be exhausting.

Memora was built to save that time. It automatically transforms your everyday photos into ready-to-post cinematic reels — complete with transitions, background music, and storytelling flow. In just seconds, it helps you relive and share your memories in a professional, emotional, and visually stunning way.

What it does

Memora turns your photos into cinematic reels automatically using a series of AI-powered steps:

🖼️ Photo Upload: Users upload their favorite photos.

🧠 Context Extraction: A fine-tuned BERT5 model analyzes the photos and extracts their emotional tone and context.

🔍 Prompt Matching: The extracted emotions are matched to the most fitting prompt from our RAG (Retrieval-Augmented Generation) database.

🎨 Video Generation: Using the generated prompt and photos, Gemini 2.5 Flash Image creates cinematic frames and visual effects.

🎬 Final Output: The visuals are compiled using MoviePy into a polished MP4 video — complete with transitions and background music.
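The five steps above can be sketched as a single pipeline. Every stage here is a labeled placeholder standing in for the real call (BERT5 inference, RAG retrieval, Gemini 2.5 Flash Image generation, MoviePy rendering); the function names, emotion labels, and return values are illustrative, not Memora's actual code:

```python
# Minimal sketch of the Memora pipeline; each stage is a placeholder
# for the real model call described above.

def extract_context(photos):
    # Placeholder for the BERT5 step: one emotion label per photo.
    return ["joyful" for _ in photos]

def match_prompt(emotions):
    # Placeholder for RAG retrieval: look up a prompt for the
    # dominant emotion in a tiny in-memory "database".
    prompt_db = {
        "joyful": "a warm, golden-hour montage",
        "calm": "a slow, drifting seascape",
    }
    dominant = max(set(emotions), key=emotions.count)
    return prompt_db.get(dominant, "a gentle cinematic sequence")

def generate_frames(prompt, photos):
    # Placeholder for Gemini 2.5 Flash Image: one styled frame per photo.
    return [f"{prompt}::{photo}" for photo in photos]

def render_video(frames):
    # Placeholder for the MoviePy render: return the output path.
    return "output/reel.mp4"

def create_reel(photos):
    emotions = extract_context(photos)
    prompt = match_prompt(emotions)
    frames = generate_frames(prompt, photos)
    return {"prompt": prompt, "frames": len(frames), "video": render_video(frames)}
```

The point of the shape is that each stage only depends on the previous one's output, so any single model can be swapped out without touching the rest.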

How we built it

Frontend: A simple Flask web app for easy uploads and previews.

Backend: Python + Flask server handling all the AI inference and video generation.

AI Models:

BERT5 for photo sentiment and contextual embedding.

RAG database for dynamic prompt retrieval.

Gemini 2.5 Flash Image for creative video frame generation.

Video Rendering: MoviePy stitches the images, adds transitions, and layers background music seamlessly.
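The rendering step could look roughly like this (MoviePy 1.x API; 2.x renamed `set_duration`/`set_audio` to `with_duration`/`with_audio`). The clip duration, 0.5 s crossfade, and file paths are illustrative assumptions, not the exact values Memora uses:

```python
def clip_timings(n_images, duration=2.5, crossfade=0.5):
    # Total length of a crossfaded sequence: adjacent clips
    # overlap by `crossfade`, so overlaps are subtracted once each.
    return n_images * duration - (n_images - 1) * crossfade

def render_reel(image_paths, music_path, out_path="reel.mp4",
                duration=2.5, crossfade=0.5):
    # Import deferred so the sketch loads even without MoviePy installed.
    from moviepy.editor import (ImageClip, AudioFileClip,
                                concatenate_videoclips)

    clips = [ImageClip(p).set_duration(duration) for p in image_paths]
    # Crossfade every clip (after the first) into its predecessor.
    faded = [clips[0]] + [c.crossfadein(crossfade) for c in clips[1:]]
    video = concatenate_videoclips(faded, method="compose",
                                   padding=-crossfade)
    # Trim the music to the video's length and attach it.
    audio = AudioFileClip(music_path).subclip(0, video.duration)
    video = video.set_audio(audio)
    video.write_videofile(out_path, fps=24)
    return out_path
```

`padding=-crossfade` is what makes the clips overlap during the fade instead of playing back to back.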

Challenges we ran into

Integrating multiple AI models efficiently within limited GPU/CPU constraints.

Getting BERT5 outputs to meaningfully match RAG prompts in real time.

Ensuring Gemini’s image outputs were consistent across various emotions.

Managing large model downloads and caching on different systems.

Synchronizing music and transitions with the emotional tone of each photo.
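The prompt-matching challenge boils down to nearest-neighbor retrieval: embed the extracted emotion, then pick the stored prompt whose embedding is closest by cosine similarity. A toy sketch, with made-up 3-dimensional vectors standing in for real BERT5 embeddings and invented prompt names:

```python
import math

# Illustrative prompt "database": prompt text -> emotion embedding.
# Real embeddings would come from the BERT5 step, in far more dimensions.
PROMPTS = {
    "golden-hour nostalgia montage": [0.9, 0.1, 0.2],
    "upbeat travel highlight reel":  [0.2, 0.9, 0.3],
    "quiet rainy-day reflection":    [0.1, 0.2, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_prompt(emotion_vec):
    # Return the prompt whose stored embedding is most similar.
    return max(PROMPTS, key=lambda p: cosine(emotion_vec, PROMPTS[p]))
```

Keeping retrieval this simple is what makes the step fast enough to run per upload; the hard part is making the embeddings on both sides comparable in the first place.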

Accomplishments that we're proud of

Built a fully working pipeline connecting NLP, vision, and video-generation models.

Successfully created AI-driven cinematic videos that actually reflect photo emotions.

Designed a scalable architecture that can be extended to different storytelling themes.

Learned to orchestrate multiple AI tools — from Transformers to Gemini — in a single workflow.

What we learned

How to merge text-based AI (BERT, RAG) with visual AI (Gemini) for a unified creative output.

The importance of prompt engineering and contextual retrieval.

How to manage and optimize large model dependencies in Python applications.

That even small design tweaks — like the right background music — can make a big emotional impact.

What's next for Memora

Adding user voice narration generated from photo captions.

Integrating music mood detection for more personalized soundtracks.

Allowing real-time previews and social media sharing.

Building a mobile version so users can create cinematic stories directly from their phones.

Expanding the RAG prompt database with thousands of emotion-driven scenarios.

Built With

  • bert5
  • gemini
  • moviepy
  • rag
  • transformers