What it does

StillTale is an AI-powered video generation platform that transforms text prompts into complete videos with:

  1. Automatic story generation - Enter a simple prompt and AI expands it into a full narrative

  2. Character consistency - AI identifies characters and generates reference images to maintain visual consistency across scenes

  3. Scene-by-scene visualization - Stories are broken into scenes with AI-generated images

  4. Voiceover narration - Text-to-speech converts narration into audio

  5. Automatic video assembly - Images and audio are merged into a polished MP4 video

Users simply log in, enter a prompt like "A brave knight rescues a dragon from a princess," and receive a complete video within minutes.

How we built it

Backend (Python/FastAPI):

  1. FastAPI for the REST API, with JWT authentication

  2. Google Gemini AI for story generation, character identification, and scene creation

  3. Bria API for AI image generation (text-to-image and image-to-image)

  4. gTTS for text-to-speech narration

  5. OpenCV and FFmpeg for video assembly
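
As a rough illustration of the video assembly step, here is a minimal sketch of how one scene's image and narration could be combined with FFmpeg. The write-up does not say how work is split between OpenCV and FFmpeg, so the helper name `build_scene_command` and the file names are assumptions for illustration only, not the project's actual code.

```python
from pathlib import Path


def build_scene_command(image: Path, audio: Path, output: Path) -> list[str]:
    """Return an FFmpeg argument list that renders one still image with narration.

    Hypothetical helper: the name and the one-clip-per-scene layout are assumptions.
    """
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-loop", "1",            # repeat the single image as a continuous video stream
        "-i", str(image),
        "-i", str(audio),
        "-c:v", "libx264",       # H.264 video
        "-tune", "stillimage",   # encoder tuning for static content
        "-pix_fmt", "yuv420p",   # pixel format most players accept
        "-c:a", "aac",           # AAC audio
        "-shortest",             # stop when the shorter input (the narration) ends
        str(output),
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine with FFmpeg installed. Building the arguments as a list rather than a shell string avoids quoting problems with file names.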

Frontend (React):

  1. React with React Router for navigation

  2. Tailwind CSS for modern, responsive UI

Challenges we ran into

  1. Character consistency - Maintaining the same character appearance across multiple scenes was difficult. We solved this by generating character reference images first and using image-to-image generation for scenes.

  2. API rate limiting - The Bria image API has rate limits. We implemented retry logic with exponential backoff and polling for async image generation.
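
The retry and polling approach described above might look roughly like the following sketch. It does not call the Bria API: `RateLimited`, `with_backoff`, and `poll_until_ready` are hypothetical names, and the delays and timeouts are placeholder values rather than the ones the project uses.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error meaning the image API rejected a request due to rate limits."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on RateLimited with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            delay = base_delay * 2 ** attempt  # 1s, 2s, 4s, ...
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")


def poll_until_ready(
    check: Callable[[], Optional[T]],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `check` until it returns a non-None result or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("image generation did not finish in time")
        sleep(interval)
```

In this sketch, `call` would wrap the request that starts an image generation job, and `check` would wrap the status request that returns the finished image (or `None` while it is still pending). Passing `sleep` as a parameter keeps the helpers easy to test without real waiting.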

What's next for StillTale

Add a unique voice for each character
