Inspiration

The spark came from watching the Avatar: The Last Airbender movie and remembering how some classic scenes from the animation had been changed. I kept wondering: since AI can generate realistic video, why can't 2D cartoons be turned into live action while staying true to the script?

What it does

I am building a mini creative production studio that gives anyone the power to go from "what if..." to a complete short-film package (script, storyboard, voiceovers, edits) in minutes, not months. Gemini 3's release felt like the perfect moment: its massive context, multimodal reasoning, planning, and self-correction capabilities could finally make an end-to-end autonomous creative agent possible.

How we built it

I started with Google AI Studio for rapid prototyping, then moved to a full repo using Python/Go with the Gemini SDK, LangGraph for the agent loop, and Streamlit for a clean web UI. I integrated free TTS and image APIs for voiceovers and visuals, and Veo for video generation. Core logic: a planner prompt → execution agents → evaluator → iterate or ask for user feedback.
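The core loop can be sketched in plain Python. This is a minimal illustration, not the project's actual code: `run_pipeline`, `StepResult`, and the three `demo_*` functions are hypothetical names, and the stubs stand in for the Gemini-backed planner, execution agents, and evaluator so the example runs without any API key.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    step: str       # name of the planned step, e.g. "script"
    output: str     # last output produced for that step
    attempts: int   # how many tries the step needed


def run_pipeline(
    idea: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str, str], str],
    evaluator: Callable[[str, str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> List[StepResult]:
    """Plan the steps, then execute each one until the evaluator accepts it."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    results = []
    for step in planner(idea):
        feedback = ""  # feedback is reset for every new step
        for attempt in range(1, max_attempts + 1):
            output = executor(step, feedback)
            accepted, feedback = evaluator(step, output)
            if accepted:
                break
        # If the attempts run out, the last output is kept as-is; a real
        # agent might hand the step to the user for feedback here instead.
        results.append(StepResult(step, output, attempt))
    return results


# Hypothetical stand-ins for the model-backed components.
def demo_planner(idea: str) -> List[str]:
    return ["script", "storyboard", "voiceover"]


def demo_executor(step: str, feedback: str) -> str:
    suffix = f" [revised: {feedback}]" if feedback else ""
    return f"{step} draft{suffix}"


def demo_evaluator(step: str, output: str) -> Tuple[bool, str]:
    # Reject the first storyboard draft to exercise the correction loop.
    if step == "storyboard" and "revised" not in output:
        return False, "add shot descriptions"
    return True, ""


if __name__ == "__main__":
    for result in run_pipeline("what if...", demo_planner, demo_executor, demo_evaluator):
        print(result)
```

Passing the planner, executor, and evaluator in as plain callables keeps the loop testable with stubs; in the real project each would presumably wrap a model call.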

Challenges we ran into

The biggest hurdles were prompt engineering for reliable creative consistency and managing API rate limits during long workflows. Debugging self-correction loops was tricky, but it taught me how powerful iterative reasoning can be.
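One common way to cope with rate limits in a long workflow is retrying with exponential backoff and jitter. The sketch below is an assumption about how that could look, not what the project does: `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client would raise its own exception type for rate-limit responses.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical placeholder for an SDK's rate-limit exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("unreachable")  # loop always returns or raises
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without actually waiting.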

Accomplishments that we're proud of

A functioning prototype that generates scripts, storyboards, and voiceovers, and executes each step with self-correction and user feedback loops.

What we learned

I learned that Gemini 3 excels at complex, multi-step orchestration when given clear tools and evaluation criteria, unlocking workflows that feel almost human.

What's next for Enno Creative Orchestrator

Continuing development and testing to ship a production-ready MVP in the next couple of months.
