🎯 The Problem & Why It Matters

  1. 42.3% of filmmakers still rely on manual storyboards, a process that is slow, costly, and often requires specialized artistic skills.
    (Source: pzaz.io)

  2. Traditional video production is prohibitively expensive for small teams and solo creators, requiring large crews and long workflows.
    (Source: elai.io)

  3. AI creative tools today are fragmented — you must juggle scripting tools, consistency tools, image models, and video editors separately.
    (Source: prst.media)

AI Studio Director solves these problems by providing a unified, intelligent, end-to-end filmmaking pipeline — reducing cost, time, and complexity for creators.


🚀 What We Built

AI Studio Director is an AI production studio that transforms a text script into a visually consistent storyboard — and then into a full cinematic music video.
Live Demo: studiodirector.streamlit.app

It includes:


🎬 Multi-Agent Filmmaking Pipeline

  • Creative Director Agent — breaks scripts into structured shots
  • Cinematography Agent — assigns camera, lens, lighting & composition
  • Continuity Agent — ensures character + environmental consistency
  • QC Agent — flags errors, issues, or style mismatches
  • Reviewer Agent — synthesizes a clean production-ready summary
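The agent hand-off above can be sketched as a simple sequential pipeline. This is a minimal illustration, not the production code: `Shot`, `run_pipeline`, and the agent function names are hypothetical stand-ins for the real orchestration.

```python
from dataclasses import dataclass, field

@dataclass
class Shot:
    """One storyboard shot, enriched as it moves through the agents."""
    description: str
    camera: str = ""
    flags: list = field(default_factory=list)

def creative_director(script: str) -> list[Shot]:
    # Break the script into structured shots (one per sentence here).
    return [Shot(s.strip()) for s in script.split(".") if s.strip()]

def cinematography_agent(shots: list[Shot]) -> list[Shot]:
    # Assign camera, lens, and lighting defaults to each shot.
    for shot in shots:
        shot.camera = "35mm lens, eye level, soft key light"
    return shots

def qc_agent(shots: list[Shot]) -> list[Shot]:
    # Flag shots whose description is too thin to storyboard.
    for shot in shots:
        if len(shot.description.split()) < 3:
            shot.flags.append("description too short")
    return shots

def run_pipeline(script: str) -> list[Shot]:
    # Sequential hand-off: each agent enriches the shared shot list.
    return qc_agent(cinematography_agent(creative_director(script)))
```

Each agent reads and enriches the same shot list, which is what makes downstream continuity and QC checks possible.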

🧠 FIBO (Film Blueprint) Structured Prompting

A powerful JSON schema encoding:

  • camera/lens geometry
  • composition and framing
  • lighting blueprints
  • film stock palettes
  • mood and environmental cues
  • actor/prop/material continuity

This enables deterministic, controllable AI cinematography.
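A FIBO blueprint for a single shot might look like the following. The field names are illustrative, mirroring the categories listed above rather than reproducing the exact schema:

```python
import json

# Hypothetical FIBO blueprint for one shot; keys mirror the categories above.
fibo_shot = {
    "camera": {"lens_mm": 35, "angle": "low", "movement": "slow dolly-in"},
    "composition": {"framing": "medium close-up", "rule": "thirds"},
    "lighting": {"key": "warm tungsten", "fill_ratio": 0.5, "practicals": ["neon sign"]},
    "film_stock": {"palette": "teal-orange print look", "grain": "fine"},
    "mood": {"tone": "melancholic", "weather": "light rain"},
    "continuity": {"actor_id": "lead_01", "wardrobe": "red jacket", "props": ["umbrella"]},
}

# Serialized blueprint that a JSON-native model can consume directly.
prompt_json = json.dumps(fibo_shot, indent=2)
```

Because every creative decision is an explicit key-value pair, regenerating a shot with one parameter changed leaves everything else fixed.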


🖼️ AI Image Generation + Enhancement

  • BRIA FIBO for high-control, JSON-native image generation
  • Enhancer, Upscale, RMBG 2.0, and background replacement
  • Local fallback upscaler for offline dev
  • Per-shot comparison and continuity diagnostics
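The local-fallback behaviour can be sketched as a try-the-API-first wrapper. This is a pure-Python sketch: `bria_upscale` and `local_upscale` are stand-ins for the real BRIA API call and the local enhancer (e.g. a Lanczos resize).

```python
def bria_upscale(image_bytes: bytes, scale: int = 2) -> bytes:
    """Stand-in for the hosted BRIA Upscaler call; raises when offline."""
    raise ConnectionError("offline")

def local_upscale(image_bytes: bytes, scale: int = 2) -> bytes:
    """Stand-in for the local fallback enhancer (no-op placeholder)."""
    return image_bytes

def upscale(image_bytes: bytes, scale: int = 2) -> bytes:
    # Prefer the hosted model; silently degrade to the local enhancer.
    try:
        return bria_upscale(image_bytes, scale)
    except (ConnectionError, TimeoutError):
        return local_upscale(image_bytes, scale)
```

The same wrap-and-degrade pattern applies to any of the third-party services in the pipeline, which keeps offline development unblocked.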

🧩 BRIA Models Used

| Model | Description |
| --- | --- |
| BRIA-3.2 | Primary high-quality image generation model, used for the FIBO → Image workflow. |
| BRIA-3.2-ControlNet-Union | Canny / Depth / ColorGrid controllability; tested extensively on Google Colab. |
| BRIA RMBG 2.0 | Background removal model; integrated into the Streamlit Asset Lab and FastAPI tools. |
| BRIA Upscaler | 2×–4× upscaling via the BRIA API; includes a local fallback enhancer when offline. |
| BRIA Inpainting Model | Used for object removal / restoration tests; experimental support in the pipeline. |
| ControlNet XL Variants | Attempted; blocked by high RAM usage and missing safetensors files. |
| Multi-ControlNet Pipelines | Attempted Pose + Canny + Depth stacks. |

😁 Character Continuity

  • Gemini 2.5 Pro Vision annotates each frame for:
    • bounding box
    • face & clothing traits
    • pose, props, and expressions
    • continuity descriptors

This enables true multi-shot character consistency.
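Each annotated frame is stored as structured metadata roughly like the record below, and a simple pass over consecutive frames can surface continuity breaks. The field names and `continuity_mismatches` helper are illustrative, not the exact annotator output:

```python
# Hypothetical per-frame annotation produced by the vision annotator.
annotation = {
    "frame": "shot_03.png",
    "characters": [
        {
            "bbox": [412, 128, 760, 980],       # x0, y0, x1, y1 in pixels
            "face": "short dark hair, round glasses",
            "clothing": "red jacket, black boots",
            "pose": "walking left, holding umbrella",
            "expression": "pensive",
            "continuity_id": "lead_01",         # stable across shots
        }
    ],
}

def continuity_mismatches(frames: list[dict]) -> list[str]:
    """Flag characters whose clothing changes between annotated frames."""
    last_seen: dict[str, str] = {}
    issues = []
    for frame in frames:
        for ch in frame["characters"]:
            cid = ch["continuity_id"]
            if cid in last_seen and last_seen[cid] != ch["clothing"]:
                issues.append(f"{cid} changed clothing in {frame['frame']}")
            last_seen[cid] = ch["clothing"]
    return issues
```

The stable `continuity_id` is the key: it lets checks compare the same character across shots rather than re-identifying them each time.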


🎥 Music Video Rendering

  • Automatic mv_video_plan.json generation
  • LongCat / Fal.ai integration for image-to-video generation
  • Produces a complete music video from the storyboard keyframes
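The generated mv_video_plan.json pairs each storyboard keyframe with motion and timing hints for the image-to-video backend. The structure shown here is illustrative of the idea, not the exact schema:

```python
import json

# Hypothetical plan: one clip per storyboard keyframe.
mv_video_plan = {
    "audio": "track.mp3",
    "fps": 24,
    "clips": [
        {"keyframe": "shot_01.png", "start": 0.0, "duration": 3.5,
         "motion": "slow push-in", "transition": "cut"},
        {"keyframe": "shot_02.png", "start": 3.5, "duration": 4.0,
         "motion": "pan right", "transition": "crossfade"},
    ],
}

# Total runtime must match the clip timings before rendering begins.
total = sum(clip["duration"] for clip in mv_video_plan["clips"])
plan_json = json.dumps(mv_video_plan, indent=2)
```

Keeping the plan as plain JSON means a creator can retime or reorder clips by hand before handing it to the video backend.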

🧩 ComfyUI Export

Export the entire storyboard + FIBO JSON into a ready-to-run ComfyUI graph:

  • Perfect for downstream editing
  • Integrates with ControlNet, LoRA, VFX pipelines
  • Ensures reproducible, controllable workflows
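Conceptually, the exporter maps each shot into ComfyUI's node-graph JSON, where nodes carry a `class_type` and reference other nodes by id. This is a heavily simplified sketch; real graphs also include latent, negative-prompt, and VAE-decode nodes, and the node names here are assumptions:

```python
def export_comfy_graph(positive_prompt: str, seed: int = 0) -> dict:
    """Build a minimal ComfyUI API-format graph for one shot."""
    return {
        # Load the checkpoint; downstream nodes reference ["1", output_index].
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "model.safetensors"}},
        # Encode the FIBO-derived positive prompt.
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"clip": ["1", 1], "text": positive_prompt}},
        # Sample with a fixed seed for reproducibility.
        "3": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["2", 0],
                         "seed": seed, "steps": 20}},
    }
```

Because every edge in the graph is an explicit `[node_id, output_index]` reference, the exported workflow is fully reproducible and easy to splice into ControlNet or LoRA stages.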

🖥️ Streamlit Production Dashboard

A full studio UI where creators can:

  • Generate the storyboard
  • Inspect continuity
  • Edit or regenerate shots
  • Preview annotated frames
  • Export ComfyUI graphs & asset packs
  • Render the music video end-to-end

🛠️ How It Works

  • FastAPI backend orchestrates agents, FIBO generation, and rendering
  • Streamlit UI provides an interactive creative environment
  • FIBOBriaImageGenerator ensures JSON → image determinism
  • Gemini Annotator adds character continuity metadata
  • LongCat video backend converts keyframes into cinematic clips
  • Modular architecture allows easy model swapping or extension

Built with: Python · FastAPI · Streamlit · BRIA AI · Gemini Vision · PIL · Fal.ai LongCat · ComfyUI node graph export · JSON DSLs


💡 What We Learned

  • Designing a structured film language (FIBO) dramatically improves AI control
  • Multi-agent orchestration yields more consistent results than single prompts
  • Robust fallbacks are essential when using multiple third-party AI services
  • Creators prefer intuitive, editable JSON controls over opaque prompts
  • Video generation quality depends heavily on consistent lighting + continuity

🏆 Impact

AI Studio Director empowers:

  • indie filmmakers
  • musicians
  • students
  • content creators
  • small studios

By automating pre-production, visual continuity, and video generation, it lowers the barrier to high-quality storytelling.


🔧 Tech Stack

  • Python
  • FastAPI
  • Streamlit
  • BRIA AI (FIBO image generation, Background Enhancer, RMBG, Upscale)
  • Gemini 2.5 Pro Vision (continuity annotation)
  • Fal.ai LongCat / AI Video Rendering
  • PIL + custom enhancement pipeline
  • ComfyUI graph exporter
  • Multi-agent orchestration system

🙏 Acknowledgements

Huge thanks to the hackathon organizers, mentors, and sponsors — especially BRIA, NVIDIA, Fal.ai, and Google DeepMind Gemini — for supporting innovation in AI filmmaking.
Your tools enabled this project to come to life.
