INFINITUM — AI Creative Director

INFINITUM generating a complete brand campaign in real time — Strategy, Visuals, Video, and Audio streamed live.
INFINITUM System Architecture — Google ADK agent orchestrating Gemini 2.0, Veo 2, Google TTS, and Cloud Run.
Campaign Score Card — AI self-evaluation with 3 animated gauges: Creativity, Brand Coherence, and Market Impact.
Real-time Google Cloud Run logs — ADK agent orchestrating Gemini, Veo 2, and Google TTS live in production.

Inspiration

Every brand campaign you see took weeks, a team, and thousands of dollars. I asked myself — what if one autonomous AI agent could do all of that in 60 seconds? That question became INFINITUM.

What it does

INFINITUM is an autonomous AI Creative Storyteller Agent that breaks the text box paradigm. You give it one brand brief — and it builds a complete brand narrative in real time.

Live Demo: https://infinitum-55416250757.europe-west1.run.app/

In under 60 seconds, INFINITUM generates 6 complete campaign sections — Strategy, Headlines, Visuals, Social, Video, and Audio — all streamed live, section by section. An AI Brief Builder chatbot guides users through 5 questions to craft the perfect brief. A Tone of Voice selector adapts the entire campaign to 4 creative modes: Luxury, Bold, Minimal, or Direct. A Campaign Score Card evaluates Creativity, Brand Coherence, and Market Impact automatically. Everything downloads in one ZIP — PDF with images, MP4 video, MP3 audio, and JSON data.

How we built it

The core is a Google ADK agent called infinitum_creative_director. It uses FunctionTools — analyze_brief() and get_section_strategy() — to extract brand keywords and build a creative strategy for every section before a single word is generated.
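To make the two tools concrete, here is a minimal sketch of what analyze_brief() and get_section_strategy() might look like as plain Python functions. Only the two function names and the six section names come from this writeup; the stopword list, the frequency-based keyword heuristic, the return shapes, and the tone parameter are assumptions made for illustration, not the project's actual code.

```python
import re
from collections import Counter

# Hypothetical stopword list; the real project may filter differently.
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "with", "from", "is", "our"}
SECTIONS = ["Strategy", "Headlines", "Visuals", "Social", "Video", "Audio"]


def analyze_brief(brief: str, max_keywords: int = 5) -> dict:
    """Return the most frequent non-stopword terms in a brand brief."""
    words = [
        w for w in re.findall(r"[a-z]+", brief.lower())
        if w not in STOPWORDS and len(w) > 2
    ]
    keywords = [word for word, _ in Counter(words).most_common(max_keywords)]
    return {"keywords": keywords}


def get_section_strategy(section: str, keywords: list[str], tone: str = "Bold") -> dict:
    """Build a per-section prompt hint from the extracted keywords (assumed shape)."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    hint = f"{tone} {section.lower()} built around: {', '.join(keywords)}"
    return {"section": section, "tone": tone, "prompt_hint": hint}


# Plain functions like these can be registered as tools on an ADK agent;
# that wiring is omitted so the sketch runs without the ADK installed.
analysis = analyze_brief("Eco sneakers for runners. Sneakers made from ocean plastic.")
plan = [get_section_strategy(name, analysis["keywords"], tone="Minimal") for name in SECTIONS]
```

Running the strategy step for all six sections up front gives each later model call a ready-made hint, which matches the idea of planning every section before any text is generated.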

Then it orchestrates 4 Google AI models:

Gemini 2.0 Flash — full campaign text (~9s)
Gemini Interleaved Output — 6 context-aware brand visuals (~5s each)
Veo 2 — cinematic brand video via Vertex AI (~40s)
Google TTS Journey-F — professional voiceover (~2s)
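Run back to back, the timings listed above add up to roughly 9 + 6 x 5 + 40 + 2 = 81 seconds, so finishing in under 60 seconds implies that some stages overlap. The sketch below shows one way that could work with asyncio. The scheduling (text first, then video, audio, and images together) is an assumption, and fake_stage is a hypothetical stand-in for the real model calls.

```python
import asyncio

# Approximate stage latencies in seconds, taken from the list above.
LATENCIES = {"text": 9, "image": 5, "video": 40, "audio": 2}
IMAGE_COUNT = 6
SCALE = 0.001  # shrink sleeps so the sketch finishes in milliseconds


def sequential_estimate() -> int:
    """Total time if every stage waits for the previous one."""
    return (LATENCIES["text"] + IMAGE_COUNT * LATENCIES["image"]
            + LATENCIES["video"] + LATENCIES["audio"])


def concurrent_estimate() -> int:
    """Total time if text runs first and all media stages then run together."""
    return LATENCIES["text"] + max(
        LATENCIES["video"], LATENCIES["audio"], LATENCIES["image"]
    )


async def fake_stage(name: str, seconds: float) -> str:
    """Hypothetical placeholder for a model call; it only sleeps."""
    await asyncio.sleep(seconds * SCALE)
    return name


async def run_campaign() -> list[str]:
    # Assumption: media prompts depend on the campaign text, so text goes first.
    text = await fake_stage("text", LATENCIES["text"])
    media = await asyncio.gather(
        fake_stage("video", LATENCIES["video"]),
        fake_stage("audio", LATENCIES["audio"]),
        *(fake_stage(f"image-{i}", LATENCIES["image"]) for i in range(IMAGE_COUNT)),
    )
    return [text, *media]
```

Under these assumptions the estimate drops from 81 to 49 seconds, with the video stage setting the pace.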

The backend is FastAPI with Server-Sent Events streaming content live to the browser. Everything runs on Google Cloud Run with assets stored in Google Cloud Storage.
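As an illustration of the streaming format, the sketch below frames campaign sections as Server-Sent Events and interleaves comment lines as keep-alives. The event names, payload fields, and the campaign_stream helper are hypothetical; only the wire format (an "event:"/"data:" pair ended by a blank line, and comment lines starting with a colon) comes from the SSE standard.

```python
import json
from typing import Iterable, Iterator


def sse_event(event: str, data: dict) -> str:
    """Frame one SSE message; json.dumps keeps the payload on a single line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def sse_keepalive() -> str:
    """An SSE comment line: ignored by EventSource clients, but still bytes on the wire."""
    return ": keep-alive\n\n"


def campaign_stream(sections: Iterable[tuple[str, str]]) -> Iterator[str]:
    """Yield a keep-alive before each section, then a final 'done' event."""
    for name, content in sections:
        yield sse_keepalive()
        yield sse_event("section", {"name": name, "content": content})
    yield sse_event("done", {})


# With FastAPI, a generator like this could be returned as
# StreamingResponse(campaign_stream(...), media_type="text/event-stream").
chunks = list(campaign_stream([("Strategy", "Own the morning run."), ("Headlines", "Run lighter.")]))
```

Because comment lines never reach the browser's message handlers, the front end needs no special handling for them.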

Challenges we ran into

Veo 2 long-running operations: video generation on Vertex AI returns a long-running operation, so the backend had to poll it every 6 seconds until the video was ready
SSE buffer flushing: events sent from FastAPI were being buffered instead of delivered immediately; sending keep-alive SSE comments forced real-time delivery
Region constraints: Veo 2 was only available in us-central1, while the Cloud Run service runs in europe-west1
Signed URLs: the Cloud Run service account could not sign URLs, so generated assets are published with make_public() instead
ADK FunctionTool: the _func attribute was not available in ADK 1.26.0, so the underlying Python methods are called directly
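The polling described in the first challenge can be sketched as a small loop. This is a generic version under stated assumptions: fetch_operation is a hypothetical callable standing in for the Vertex AI status request, the operation is assumed to be a dict with "done", "error", and "response" keys, and the 5-minute timeout is an invented default. Only the 6-second interval comes from this writeup.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_operation: Callable[[], dict],
    interval_s: float = 6.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_operation repeatedly until it reports done, fails, or times out."""
    waited = 0.0
    while True:
        operation = fetch_operation()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if waited >= timeout_s:
            raise TimeoutError(f"operation not finished after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Usage with a fake operation that completes on the third status check.
states = iter([
    {"done": False},
    {"done": False},
    {"done": True, "response": {"video_uri": "gs://example-bucket/video.mp4"}},
])
sleeps: list[float] = []
result = poll_until_done(lambda: next(states), sleep=sleeps.append)
```

Passing sleep in as a parameter keeps the loop testable without real waiting; in production the default time.sleep applies.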

Accomplishments that we're proud of

Built a complete multimodal AI product solo in 3 weeks
Successfully integrated Veo 2 — one of the most advanced video generation models available
Created a truly autonomous ADK agent that orchestrates 4 AI models simultaneously
Achieved real-time streaming of text, images, video, and audio in a single fluid experience
Built a product that is genuinely useful for real marketing professionals today

What we learned

Google ADK's Agent and FunctionTool make AI orchestration genuinely powerful and structured
Gemini's interleaved output is a game-changer for creative multimodal applications
Veo 2 produces surprisingly high-quality brand-appropriate video from short prompts
SSE streaming transforms user experience — live generation feels magical compared to waiting
Context-aware prompts are the difference between generic AI output and truly branded content