Inspiration

I was inspired by a fundamental bottleneck in cinematic and architectural visualization: the tradeoff between speed and physical accuracy. Directors and designers need to iterate on ideas in minutes, but building even a rough 3D scene still requires manually setting up cameras, adjusting lighting, finding high-quality backplates, and making everything feel grounded. That process takes hours.

At the same time, AI image generation has become incredibly powerful, but it remains spatially unaware. It can generate beautiful frames, yet it does not understand camera lenses, sun angles, or how an image should integrate into a real 3D viewport.

My goal with Previz was to bridge this gap. Instead of using AI as a static image generator, I wanted to turn it into an agentic Director of Photography that understands cinematic intent, rigs a physical 3D scene, and supports rapid, director-style iteration.

What it does

Previz is a FIBO-powered Blender add-on that acts as an autonomous Director of Photography, transforming natural language into a physically grounded 3D scene rather than a static image.

From a single line of text such as “Wide shot, low angle, sunset over a cityscape,” Previz generates a camera-ready 3D setup inside Blender, with correct lensing, lighting, and compositing-ready backgrounds.

The core idea behind Previz is Text-to-Physics with iterative refinement.

When a prompt is submitted, Previz uses Bria’s FIBO model to translate cinematic language into structured, JSON-native scene parameters describing shot type, camera angle, field of view, lighting direction, softness, and mood. This structured output becomes the authoritative scene state. Using Blender’s Python API, Previz physically rigs the camera and lighting, then orchestrates a second FIBO call to generate a 16-bit EXR environment backplate, mapped flat and perspective-correct in the World node tree.
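As a rough illustration of this idea (the field names below are hypothetical, not FIBO's actual schema), the structured output might look like the JSON here, with the model's field of view converted into the focal length Blender's physical camera expects:

```python
import json
import math

# Hypothetical FIBO-style structured scene parameters; the exact
# field names are illustrative, not the real schema.
scene_json = """
{
  "shot_type": "wide",
  "camera": {"angle": "low", "fov_deg": 65.0},
  "lighting": {"direction_deg": 250.0, "softness": 0.3, "mood": "sunset"}
}
"""

def fov_to_focal_length(fov_deg: float, sensor_width_mm: float = 36.0) -> float:
    """Convert a horizontal field of view to a focal length for a
    full-frame (36 mm wide) sensor, the kind of physical camera
    parameter Blender works with."""
    return sensor_width_mm / (2.0 * math.tan(math.radians(fov_deg) / 2.0))

state = json.loads(scene_json)        # becomes the authoritative scene state
focal = fov_to_focal_length(state["camera"]["fov_deg"])
print(round(focal, 1))                # roughly 28.3 mm, i.e. a wide lens
```

Keeping the state JSON-native like this is what makes everything downstream (rigging, refinement, export) deterministic rather than prompt-dependent.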

Previz also supports foreground elements, allowing users to generate specific props or characters through FIBO, automatically remove their backgrounds, and insert them into the scene as camera-aligned cards that integrate naturally with lighting and shadows.

Once the scene exists, Previz supports Refine Mode, enabling natural language adjustments such as “make the lighting warmer” or “change to a low angle shot.” Instead of regenerating the scene, Previz applies only the relevant JSON updates while preserving the camera framing, background, and placed assets, mirroring how a real Director of Photography iterates on set.
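The preservation behavior can be sketched as a recursive merge of the model's partial update into the authoritative state (a minimal sketch with hypothetical field names, not the add-on's exact code):

```python
import copy

def apply_refinement(state: dict, updates: dict) -> dict:
    """Recursively merge a partial JSON update into the authoritative
    scene state, leaving every untouched parameter exactly as it was."""
    merged = copy.deepcopy(state)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_refinement(merged[key], value)
        else:
            merged[key] = value
    return merged

# "Make the lighting warmer" comes back as a lighting-only delta;
# camera framing survives unchanged.
state = {
    "camera": {"angle": "low", "fov_deg": 65.0},
    "lighting": {"color_temp_k": 6500, "softness": 0.3},
}
updated = apply_refinement(state, {"lighting": {"color_temp_k": 3200}})
print(updated["camera"])    # {'angle': 'low', 'fov_deg': 65.0}
print(updated["lighting"])  # {'color_temp_k': 3200, 'softness': 0.3}
```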

Finally, Previz supports Export to Set, serializing the camera and lighting parameters into structured JSON so the setup can be reused in other tools or referenced on a real production set.

This makes Previz a stateful, controllable, production-oriented workflow that bridges generative AI and real 3D previsualization.

How I built it

Prompt parsing and rigging

A custom Blender add-on written in Python uses Bria’s FIBO model to translate cinematic language into structured, JSON-native scene parameters. Shot descriptions are converted into explicit values such as focal length, camera position and rotation, and lighting direction and softness. These parameters become authoritative scene state and are mapped directly to Blender’s physical camera and lighting controls using the bpy API.
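The camera-placement half of that mapping reduces to spherical coordinates plus an aim rotation. The sketch below shows the math under Blender's camera convention (-Z forward, +Y up); in the add-on these values would be assigned to a bpy camera's `location` and `rotation_euler`, but the helper itself is illustrative:

```python
import math

def rig_camera(distance: float, azimuth_deg: float, elevation_deg: float):
    """Place a camera on a sphere around the origin and return its
    location plus the XYZ Euler rotation that aims it back at the
    origin, following Blender's -Z-forward camera convention."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.sin(az)
    y = -distance * math.cos(el) * math.cos(az)
    z = distance * math.sin(el)
    # A camera rotated (90°, 0, 0) looks along -Y at the horizon;
    # pitch past 90° tilts it up, under 90° tilts it down.
    rot_x = math.radians(90.0) - el
    rot_z = az
    return (x, y, z), (rot_x, 0.0, rot_z)

# A "low angle" shot: camera below the subject's eye line, tilted up.
loc, rot = rig_camera(distance=10.0, azimuth_deg=0.0, elevation_deg=-15.0)
```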

API orchestration with refinement awareness

The structured scene parameters produced by FIBO are stored and reused across generations. For initial scene creation, the parsed JSON is used both to rig the Blender scene and to drive a FIBO call that generates a 16-bit EXR environment backplate. For refinement, the existing JSON scene state is combined with a natural language instruction and sent back to FIBO, allowing the model to return minimal, targeted JSON updates rather than regenerating the scene from scratch.
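The shape of that orchestration can be shown with a small session object. The payload structure and the injected `call_model` stand in for the real FIBO API call (so the flow runs without network access); they are assumptions for illustration, not the actual request format:

```python
class PrevizSession:
    """Minimal sketch of refinement-aware orchestration: cache the
    authoritative scene state and reuse it on every refine call."""

    def __init__(self, call_model):
        self.call_model = call_model  # injected stand-in for the FIBO API
        self.state = None             # authoritative JSON scene state

    def create(self, prompt: str) -> dict:
        self.state = self.call_model({"mode": "create", "prompt": prompt})
        return self.state

    def refine(self, instruction: str) -> dict:
        # Send the existing state along with the note so the model can
        # return only the parameters that actually change.
        delta = self.call_model({
            "mode": "refine",
            "instruction": instruction,
            "scene_state": self.state,
        })
        for key, value in delta.items():
            if isinstance(value, dict) and isinstance(self.state.get(key), dict):
                self.state[key].update(value)
            else:
                self.state[key] = value
        return self.state

# Fake model: creation returns full state, refinement a minimal delta.
def fake_model(payload):
    if payload["mode"] == "create":
        return {"camera": {"fov_deg": 65.0}, "lighting": {"mood": "sunset"}}
    return {"lighting": {"mood": "golden hour"}}

session = PrevizSession(fake_model)
session.create("Wide shot, low angle, sunset over a cityscape")
session.refine("make the lighting warmer")
print(session.state["camera"]["fov_deg"])  # 65.0 — framing preserved
```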

For foreground assets, FIBO is used to generate prop imagery, which is then automatically passed to Bria’s RMBG 2.0 API. The resulting transparent images are downloaded asynchronously and inserted into Blender as camera-aligned cards.
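The asynchronous download step might look like the sketch below. The `fetch` coroutine is injected (in the add-on it would wrap an HTTP GET against the RMBG result URL; here a stub stands in), and running downloads concurrently off the main thread is what keeps Blender's UI responsive:

```python
import asyncio

async def download_props(urls, fetch):
    """Download background-removed prop images concurrently.
    `fetch(url)` is an injected coroutine returning the image bytes."""
    async def one(url):
        data = await fetch(url)
        return url, data
    return dict(await asyncio.gather(*(one(u) for u in urls)))

# Stub fetcher standing in for a real HTTP client.
async def fake_fetch(url):
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return b"PNG-bytes-for-" + url.encode()

results = asyncio.run(download_props(["https://example.com/a.png"], fake_fetch))
```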

Compositing automation

Using Blender’s Python API, Previz programmatically constructs the World node tree, mapping the EXR image with Window coordinates so it behaves as a true, perspective-correct backplate. Shadow catcher planes are created and configured automatically to ground 3D objects into the AI-generated environment.
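The Window-coordinate trick can be sketched as the node wiring below. It requires Blender's `bpy` and uses standard bpy node and socket names, but it is a sketch of the idea, not the add-on's exact code:

```python
import bpy

def build_backplate_world(exr_path: str) -> None:
    """Wire a World node tree that samples the EXR with Window texture
    coordinates, locking the image to the camera frame so it behaves
    as a flat, perspective-correct backplate rather than a sphere map."""
    world = bpy.context.scene.world
    world.use_nodes = True
    nodes = world.node_tree.nodes
    links = world.node_tree.links
    nodes.clear()

    coords = nodes.new("ShaderNodeTexCoord")
    image = nodes.new("ShaderNodeTexImage")    # flat image, not equirect
    background = nodes.new("ShaderNodeBackground")
    output = nodes.new("ShaderNodeOutputWorld")

    image.image = bpy.data.images.load(exr_path)

    # Window coordinates map the image to screen space instead of the
    # world sphere, which is what keeps the backplate camera-locked.
    links.new(coords.outputs["Window"], image.inputs["Vector"])
    links.new(image.outputs["Color"], background.inputs["Color"])
    links.new(background.outputs["Background"], output.inputs["Surface"])
```

This only runs inside Blender; the shadow-catcher planes would be created separately as regular mesh objects with their catcher property enabled.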

Challenges I ran into

Preserving perspective between iterations

Maintaining geometric consistency between a 2D AI-generated image and a 3D camera was critical, especially across refinements. This was solved by locking the background to the camera using Window coordinates rather than environment mapping.

Accomplishments that I'm proud of

Deterministic cinematic control

Text directly maps to physical camera and lighting parameters, not just visual style.

True refine mode

I built a system where users can iteratively adjust a scene using natural language without resetting the camera, lighting, or placed assets.

Production-oriented output

EXR backplates, shadow catchers, and transparent foreground elements make the output immediately usable in real previz and VFX workflows.

What I learned

I learned that the real power of generative AI for professional tools comes from FIBO’s structured, JSON-native approach, not from one-off free-form prompts. FIBO’s ability to translate cinematic language into explicit, machine-readable parameters made it possible to build a stateful, controllable, and iterative workflow rather than a purely generative image pipeline.

By treating FIBO’s structured output as authoritative scene state, I was able to support refinement, takes, and deterministic control over camera and lighting, which is essential for real production workflows.

I also gained deep hands-on experience with Blender’s Python API, particularly around camera math, world shader construction, asynchronous execution, and advanced compositing techniques such as shadow catchers.

What's next for PreViz

Depth-aware refinement

Leveraging FIBO’s depth outputs to introduce limited parallax and proxy geometry, allowing subtle camera movement while preserving refine-mode compatibility and scene consistency.

Richer asset intelligence

Extending the Prop Shop to use FIBO-driven scene context to reason about scale, placement, and interaction of assets, rather than treating foreground elements as simple insertions.

Refine-first UI

Designing the interface around FIBO-powered refinement, making iterative, parameter-level control the primary interaction model, closer to how directors and cinematographers give notes on set.
