LumenSet


💡 Inspiration

In many computer vision workflows, datasets are costly to create and difficult to update. Even small changes in camera angle, lighting, or object design often require regenerating large portions of data.

Most synthetic image tools rely on text prompts, which makes precise control and reproducibility unreliable. Bria FIBO’s JSON-native generation offers a different model: explicit, structured parameters for camera, lighting, and composition.

LumenSet was built to make dataset generation programmable, reproducible, and easy to iterate on, turning synthetic data into an engineering workflow rather than a manual process.

🎯 What LumenSet Does

LumenSet is a structured synthetic dataset generator for computer vision and ML teams, built entirely around FIBO’s JSON-native workflow.

🎛️ Precision Parameter Control

Camera: rotation, tilt, zoom
Lighting: direction, hardness, color temperature
Environment: background materials, surface finishes, focal length
Materials & Composition: texture, imperfections, mood, framing rules

Every parameter is explicit, editable, and deterministic.
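As a rough illustration, the parameter groups above could be held in one plain object. The field names and values here are hypothetical and are not FIBO's actual schema; the sketch only shows how a single value might be changed without touching the rest.

```javascript
// Hypothetical parameter set; field names are illustrative, not FIBO's schema.
const viewParams = {
  camera: { rotationDeg: 45, tiltDeg: 15, zoom: 1.0 },
  lighting: { direction: "front-left", hardness: "soft", colorTemperatureK: 5600 },
  environment: { background: "concrete", surfaceFinish: "matte", focalLengthMm: 50 },
  composition: { texture: "brushed metal", imperfections: "light scratches", mood: "neutral", framing: "centered" },
};

// Return a copy with one value changed, leaving the original object untouched.
function withParam(params, group, key, value) {
  return { ...params, [group]: { ...params[group], [key]: value } };
}

const rotated = withParam(viewParams, "camera", "rotationDeg", 90);
console.log(rotated.camera); // rotation is now 90; tilt and zoom are carried over
```

Copying instead of mutating keeps each queued view independent, which matters once many variations share one base configuration.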

🔄 Dual Generation Modes

Auto Sweep: generate all combinations automatically (e.g., rotations × tilts × lighting)
Manual Queue: hand-pick specific views for targeted datasets
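An Auto Sweep of this kind amounts to a Cartesian product over the chosen axes. The following is a minimal sketch of that idea, not LumenSet's actual code; the axis names and values are made up for the example.

```javascript
// Build every combination of the given axes (Cartesian product).
function sweep(axes) {
  return Object.entries(axes).reduce(
    (combos, [name, values]) =>
      combos.flatMap((combo) => values.map((v) => ({ ...combo, [name]: v }))),
    [{}]
  );
}

const views = sweep({
  rotationDeg: [0, 90, 180, 270],
  tiltDeg: [0, 30],
  lighting: ["soft", "hard"],
});
console.log(views.length); // 4 × 2 × 2 = 16
```

A Manual Queue, by contrast, would simply be an array of hand-written objects of the same shape, so both modes can feed the same generation loop.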

🔒 100% Reproducibility

Seed-locked generation guarantees identical outputs
Every image exports with full JSON metadata
Any image can be recreated pixel-for-pixel using its seed + structured prompt
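One way to make "seed + structured prompt" checkable is to store a canonical string of both alongside each image, so two requests can be compared exactly. This is a sketch under that assumption; the record layout and the `requestKey` field are hypothetical.

```javascript
// Serialize JSON-compatible data with sorted keys so equal content yields equal strings.
function canonicalize(value) {
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  if (value !== null && typeof value === "object") {
    return "{" + Object.keys(value).sort()
      .map((k) => JSON.stringify(k) + ":" + canonicalize(value[k]))
      .join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hypothetical per-image record: the seed plus the structured prompt that was sent.
function makeRecord(fileName, seed, structuredPrompt) {
  return { fileName, seed, structuredPrompt, requestKey: canonicalize({ seed, structuredPrompt }) };
}

const original = makeRecord("a.png", 42, { camera: { rotationDeg: 90, tiltDeg: 0 }, lighting: "soft" });
const reordered = makeRecord("b.png", 42, { lighting: "soft", camera: { tiltDeg: 0, rotationDeg: 90 } });
console.log(original.requestKey === reordered.requestKey); // true: key order does not matter
```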

🔬 Disentanglement Proof

LumenSet visually proves FIBO’s unique capability:

Same object
Same seed
Same materials
Only one parameter changes (e.g., camera angle)

This level of isolation is impossible with traditional prompt-based models.
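The "only one parameter changes" condition can also be checked mechanically before two images are shown side by side. The sketch below compares two request objects and lists the paths that differ; the object shape is an assumption for illustration.

```javascript
// Flatten nested objects into "a.b.c" -> value pairs.
function flatten(obj, prefix = "", out = {}) {
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      flatten(value, path, out);
    } else {
      out[path] = value;
    }
  }
  return out;
}

// List the paths whose values differ between two requests.
function changedPaths(a, b) {
  const fa = flatten(a);
  const fb = flatten(b);
  const keys = new Set([...Object.keys(fa), ...Object.keys(fb)]);
  return [...keys].filter((k) => JSON.stringify(fa[k]) !== JSON.stringify(fb[k])).sort();
}

const base = { seed: 42, camera: { rotationDeg: 0, tiltDeg: 15 }, material: "ceramic" };
const variant = { seed: 42, camera: { rotationDeg: 90, tiltDeg: 15 }, material: "ceramic" };
console.log(changedPaths(base, variant)); // [ 'camera.rotationDeg' ]
```

A pair would qualify for the comparison view only when this list has exactly one entry.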

📦 ML-Ready Export

ZIP with images and per-image JSON metadata
Dataset manifest with overview and statistics
Reproduction instructions included
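A dataset manifest along these lines could be derived directly from the per-image records. The layout below is a hypothetical sketch and does not reflect LumenSet's actual export format; ZIP packaging is left out because it would need a separate library.

```javascript
// Summarize exported records into a manifest object (hypothetical layout).
function buildManifest(datasetName, records) {
  const imagesPerRotation = {};
  for (const record of records) {
    const key = String(record.params.rotationDeg);
    imagesPerRotation[key] = (imagesPerRotation[key] || 0) + 1;
  }
  return {
    datasetName,
    imageCount: records.length,
    uniqueSeeds: new Set(records.map((r) => r.seed)).size,
    imagesPerRotation,
    // Pair each image with its sidecar metadata file.
    files: records.map((r) => ({ image: r.fileName, metadata: r.fileName.replace(/\.png$/, ".json") })),
  };
}

const manifest = buildManifest("mug-sweep", [
  { fileName: "view_000.png", seed: 42, params: { rotationDeg: 0 } },
  { fileName: "view_001.png", seed: 42, params: { rotationDeg: 90 } },
  { fileName: "view_002.png", seed: 42, params: { rotationDeg: 90 } },
]);
console.log(JSON.stringify(manifest, null, 2));
```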

Built for researchers, not just demos.

🛠️ How I Built It

Architecture Choice: Vanilla JavaScript, HTML, CSS

Key Technical Highlights

Seed locking strategy to preserve object identity across variations
Structured prompt manipulation using multi-field reinforcement to ensure reliable camera and lighting control

🧗 Challenges & Breakthroughs

1️⃣ Camera Angle Control

Early results were inconsistent: sometimes the object rotated, sometimes the camera moved, sometimes nothing changed. After deep experimentation, I discovered that reinforcing the same parameter across multiple structured fields produces consistent, deterministic behavior. This undocumented insight became the backbone of LumenSet’s reliability.

Time spent: ~16 hours
Outcome: solid camera disentanglement
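The multi-field reinforcement idea might look roughly like the sketch below: the same camera instruction is written into several fields of the structured prompt instead of just one. Which fields are reinforced, and their names, are assumptions made for illustration; the writeup does not list them, and they should not be read as FIBO's documented schema.

```javascript
// Describe a camera rotation in words.
function describeRotation(deg) {
  return `camera rotated ${deg} degrees around the object`;
}

// Write the same camera instruction into several fields (hypothetical field names).
function reinforceCameraAngle(prompt, deg) {
  const phrase = describeRotation(deg);
  return {
    ...prompt,
    short_description: `${prompt.short_description}, ${phrase}`,
    photographic_characteristics: { ...prompt.photographic_characteristics, camera_angle: phrase },
    aesthetics: { ...prompt.aesthetics, composition: `${prompt.aesthetics?.composition ?? "centered"}; ${phrase}` },
  };
}

const basePrompt = {
  short_description: "a white ceramic mug on a wooden table",
  photographic_characteristics: { lens_focal_length: "50mm" },
  aesthetics: { composition: "centered" },
};
const reinforced = reinforceCameraAngle(basePrompt, 90);
console.log(reinforced.photographic_characteristics.camera_angle);
```

Everything unrelated to the camera is copied through unchanged, so the seed and object description stay fixed across the variation.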

2️⃣ Async Polling Without UI Freeze

FIBO’s generation is asynchronous. LumenSet uses non-blocking polling with live progress feedback, keeping the interface responsive throughout long batch jobs.
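Non-blocking polling of this kind can be sketched with `async`/`await` and a timer, so the event loop stays free between checks. In the sketch, `checkStatus` is a caller-supplied function (for example, one that fetches a job's status URL), and the `"completed"` and `"failed"` state names are assumptions, not values taken from the FIBO API.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll checkStatus() until it reports completion, yielding to the event loop between attempts.
async function pollUntilDone(checkStatus, { intervalMs = 1000, maxAttempts = 60, onProgress = () => {} } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await checkStatus();
    onProgress({ attempt, state: status.state }); // e.g. update a progress bar
    if (status.state === "completed") return status;
    if (status.state === "failed") throw new Error("Generation failed");
    await sleep(intervalMs);
  }
  throw new Error("Timed out waiting for generation");
}
```

Because each wait is an awaited timer and not a busy loop, UI handlers and other batch items can keep running while a job is pending.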

3️⃣ ML-Friendly Metadata Design

I interviewed ML engineers and asked: “What makes you trust a dataset?” The result:

Enough metadata to reproduce and debug
Enough structure to filter, sort, and analyze
Nothing unnecessary

Every field in the export serves a real research purpose.
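To show what "enough structure to filter, sort, and analyze" could mean in practice, here is a small sketch that selects and orders records by their parameters. The record shape and the `params` field are hypothetical.

```javascript
// Select records matching every key/value in `where`, sorted by a numeric parameter.
function selectViews(records, where, sortBy) {
  return records
    .filter((r) => Object.entries(where).every(([k, v]) => r.params[k] === v))
    .sort((a, b) => a.params[sortBy] - b.params[sortBy]);
}

const records = [
  { fileName: "view_002.png", params: { rotationDeg: 180, lighting: "soft" } },
  { fileName: "view_001.png", params: { rotationDeg: 90, lighting: "hard" } },
  { fileName: "view_000.png", params: { rotationDeg: 0, lighting: "soft" } },
];
const softViews = selectViews(records, { lighting: "soft" }, "rotationDeg");
console.log(softViews.map((r) => r.fileName)); // [ 'view_000.png', 'view_002.png' ]
```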

📚 What I Learned

1. JSON-Native Generation Is the Future

Text prompts are ambiguous.
Structured generation is deterministic, debuggable, and automatable.

This is the difference between:

Natural language guessing
vs
Programmatic control

2. Reproducibility Is a Superpower

Seed-based regeneration enables:

Scientific reproducibility
Controlled A/B experiments
Dataset debugging and auditing

This is impossible with most mainstream image models.

🏆 Why LumenSet Wins

Perfect Fit for JSON-Native Workflows

LumenSet doesn’t just use FIBO—it demonstrates why FIBO is different:

Structured prompt generation
Programmatic parameter control
True disentanglement
Full JSON export for automation

Innovation: Disentanglement Proof

Side-by-side visual proof that only one parameter changed.
This is something judges can see instantly—and something prompt-based models cannot do.