Inspiration

As 3D artists and VFX professionals, we've always faced a bottleneck: creating HDR environment maps requires expensive equipment, manual bracketing photography, or rendering the same scene multiple times at different exposures. We asked ourselves: "What if AI could intelligently 'rephotograph' a scene at different exposures, just like a real photographer brackets shots?"

The inspiration came from traditional computational photography (Debevec's HDR work from the 90s) meeting modern generative AI. We realized BRIA's structured prompt API was the missing link—it could understand and manipulate lighting conditions with precision, enabling virtual bracketing that was previously impossible.


What it does

Lumina-Agent is an autonomous three-agent system that converts a single 3D render into a production-ready 32-bit HDR (.exr) lighting asset:

  1. Scout Agent - Analyzes your input image and extracts "Scene DNA" (camera parameters, geometry, lighting direction) into a structured JSON blueprint using BRIA's image-to-structured-prompt API.

  2. Director Agent - Takes that JSON and intelligently modifies only the lighting conditions to generate three deterministic exposures (-2EV, 0EV, +2EV) using BRIA's structured-prompt-to-image generation with a locked seed.

  3. Fusion Reactor - Applies the Debevec HDR merge algorithm to combine the bracketed exposures into a 32-bit floating-point EXR file with physics-compliant light values.

Users can preview the HDR with three tonemapping algorithms (Drago, Reinhard, Mantiuk) and download the professional asset for use in Blender, Unreal Engine, Unity, or any production pipeline.
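The Fusion Reactor step can be sketched as a simplified Debevec-style weighted merge. This is a minimal NumPy illustration, not the calibrated algorithm — the actual pipeline uses OpenCV's `cv2.createMergeDebevec` — and it assumes the bracketed images are normalized to [0, 1]:

```python
import numpy as np

def merge_hdr(exposures, times):
    """Merge bracketed exposures into a linear radiance map (simplified sketch).

    Each pixel is divided by its exposure time to recover relative radiance,
    then averaged with a triangle weight that trusts mid-tones most and
    ignores clipped shadows/highlights.
    """
    def weight(z):
        return 1.0 - np.abs(z - 0.5) * 2.0  # 1.0 at mid-gray, 0.0 at 0 and 1

    num = np.zeros_like(exposures[0], dtype=np.float64)
    den = np.zeros_like(exposures[0], dtype=np.float64)
    for img, t in zip(exposures, times):
        w = weight(img)
        num += w * img / t
        den += w
    return (num / np.maximum(den, 1e-6)).astype(np.float32)
```

A pixel that clips to white in the 0EV and +2EV shots but survives in the -2EV shot merges to a radiance value above 1.0 — exactly the dynamic range the 32-bit EXR preserves.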


How we built it

Core Architecture:

  • Frontend: Gradio for real-time agent status streaming and interactive previews
  • Vision Pipeline: BRIA's V2 Structured Prompt API for bidirectional image↔JSON translation
  • Computational Photography: OpenCV's Debevec algorithm for HDR merging
  • Workflow Orchestration: Python async job polling with state management

The JSON Innovation: The breakthrough was realizing BRIA's structured prompts aren't just "better prompts"—they're programmable scene descriptors. Our Director Agent performs surgical JSON manipulation:

import json

# Load the master scene JSON produced by the Scout Agent
with open("master_scene.json") as f:
    payload_structure = json.load(f)

# Modify ONLY the lighting.conditions field
payload_structure['lighting']['conditions'] += " Overexposed, +2EV"

# Re-generate with locked seed = perfect geometric consistency
response = bria_api.generate(structured_prompt=json.dumps(payload_structure), seed=12345)

This surgical precision is impossible with text prompts—you'd get random variations in camera angle, object position, and geometry. JSON-native control ensures deterministic bracketing.
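Generating the full bracket is the same surgical edit applied three times. A sketch of how the Director Agent could produce one payload per exposure (the `lighting.conditions` field matches the snippet above; the exact BRIA schema and modifier strings are illustrative assumptions):

```python
import copy

# Illustrative EV -> prompt-modifier mapping (actual wording is tuned per model)
EV_MODIFIERS = {
    -2.0: " Underexposed, -2EV, low key lighting",
     0.0: "",
     2.0: " Overexposed, +2EV",
}

def bracket_payloads(master_payload):
    """Return one (ev, payload) pair per exposure, touching ONLY lighting.conditions."""
    variants = []
    for ev, modifier in sorted(EV_MODIFIERS.items()):
        payload = copy.deepcopy(master_payload)  # never mutate the master scene
        payload["lighting"]["conditions"] += modifier
        variants.append((ev, payload))
    return variants
```

Each variant is then sent to generation with the same locked seed, so only lighting differs between the three renders.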

Tech Stack:

  • BRIA API (structured_prompt/generate + v2/image/generate)
  • OpenCV (HDR merge, tonemapping)
  • NumPy (exposure time arrays, floating-point math)
  • Gradio (streaming UI, file handling)
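The exposure-time array the Debevec merge consumes follows directly from the EV offsets, since each stop doubles the exposure (t = 2^EV relative to the base shot). OpenCV expects it as a float32 array:

```python
import numpy as np

# -2EV / 0EV / +2EV brackets expressed as relative exposure times
ev_offsets = np.array([-2.0, 0.0, 2.0])
exposure_times = (2.0 ** ev_offsets).astype(np.float32)  # 0.25s, 1s, 4s
```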

Challenges we ran into

  1. The "Geometric Drift" Problem

    • Challenge: Initial text prompt modifications caused camera angles and object positions to shift between exposures, breaking HDR alignment requirements.
    • Solution: Discovered that BRIA's structured prompts with locked seeds maintain perfect geometric consistency while allowing lighting variations—this became our core innovation.
  2. Exposure Control Without Destroying Color

    • Challenge: Simply adding "dark" or "bright" to prompts created color shifts and mood changes, not true exposure bracketing.
    • Solution: Engineered specific exposure modifiers (-2EV, +2EV) combined with technical lighting terms (underexposed, low key lighting, dim environment) to simulate f-stop changes while preserving color fidelity.
  3. 32-bit EXR Codec Support

    • Challenge: OpenCV requires EXR support to be enabled explicitly, which isn't obvious—we spent hours debugging silent failures.
    • Solution: os.environ["OPENCV_IO_ENABLE_OPENEXR"] = "1" must be set before cv2 is imported. We documented this as the first line of our code.
  4. Async Job Management

    • Challenge: Coordinating three simultaneous image generation jobs while providing real-time UI feedback.
    • Solution: Implemented polling with job tracking dictionaries and Gradio's yield-based streaming for live agent status updates.
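The polling-plus-streaming pattern from challenge 4 can be sketched as a generator, which maps naturally onto Gradio's yield-based UI updates. `fetch_status` and the status strings here are illustrative stand-ins for a real call to each job's status URL:

```python
import time

def poll_jobs(job_ids, fetch_status, interval=1.0):
    """Yield (job_id, status) updates until every job reaches a terminal state."""
    pending = set(job_ids)
    while pending:
        for job_id in sorted(pending):
            status = fetch_status(job_id)  # e.g. GET on the job's status URL
            yield job_id, status           # stream to the UI as it arrives
            if status in ("done", "failed"):
                pending.discard(job_id)
        if pending:
            time.sleep(interval)
```

In the app, each yielded update would refresh the corresponding agent's status panel, so users watch all three generation jobs progress in real time.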

Accomplishments that we're proud of

🏆 First True "Computational Bracketing" System Using GenAI - We proved that structured prompts can enable virtual photography techniques that mirror real-world professional workflows.

🏆 Production-Ready Output - Not just a demo—outputs genuine 32-bit HDR files used by professional rendering engines. Pixel values exceed 1.0 (proof of HDR dynamic range).

🏆 Surgical JSON Manipulation - Demonstrated that BRIA's structured prompts are programmable scene graphs, not just better text prompts. Our Director Agent modifies one JSON field while preserving 20+ others with perfect fidelity.

🏆 Zero Manual Intervention - Full autonomous agent workflow from single image → HDR asset in ~60 seconds.

🏆 Educational Impact - The tonemapping comparison visually demonstrates why HDR matters, making computational photography concepts accessible to artists.


What we learned

Technical Insights:

  • Structured prompts are scene graphs, not strings - JSON-native APIs enable parametric control impossible with text
  • Seed determinism + structured modifications = virtual camera control - You can "re-shoot" a scene with different lighting while maintaining perfect alignment
  • HDR requires <1% geometric variation - Even tiny shifts break the Debevec algorithm—structured prompts solved this

Design Insights:

  • Agentic workflows need personality - Our Scout/Director/Fusion metaphor makes the pipeline intuitive
  • Real-time feedback is critical - Streaming agent logs transformed the UX from "black box" to "collaborative process"
  • Professional output validates innovation - Downloadable .EXR files prove this isn't just a hackathon gimmick

API Insights:

  • BRIA's bidirectional image↔JSON pipeline is underutilized—most people only use text-to-image
  • The "LLM translator" for structured prompts is more powerful than direct JSON writing—it handles edge cases intelligently
  • Async job polling with status URLs is production-grade architecture (not webhook-dependent)

What's next for Lumina-Agent

Short-term (Production-Ready Features):

  • Multi-image HDR stitching - Generate 360° HDRI environment maps from a single piece of concept art
  • Batch processing - Process entire asset libraries overnight
  • Custom exposure bracketing - Let users define EV ranges (5-shot, 7-shot brackets)
  • ACES color space support - Industry-standard color workflows for VFX pipelines

Medium-term (Advanced Agent Capabilities):

  • Reflection removal agent - Detect and separate direct lighting from reflections for IBL workflows
  • Automatic ground plane detection - Generate proper shadow-catching planes for compositing
  • Material decomposition - Separate albedo, roughness, and lighting into PBR texture maps
  • Quality validator agent - Detect alignment issues, clipping, or artifacts before HDR merge

Long-term (Ecosystem Integration):

  • Blender/Unreal/Unity plugins - One-click HDR generation from viewport renders
  • Real-time preview - Live tonemapping during generation (streaming partial results)
  • Style-consistent HDRI libraries - Generate matching HDR sets for entire projects
  • Collaborative agent swarms - Multiple Scout agents vote on best scene interpretation

Research Direction: Explore using structured prompts for inverse rendering—given a lit scene, decompose it into geometry + lighting + materials using agent negotiation. This would be the "holy grail" of computational photography meeting GenAI.
