Inspiration

We’ve all seen how “prompting an image model” can feel like rolling dice: great when it works, painful when you need a very specific, on‑brand result. ChromaGen was born from that frustration. We wanted a way for brand and creative teams to describe what they need once, in their own language, and then have the system handle the messy details (palette, gaze, structure) without constant tweaking.

The FIBO model and its pro‑grade controls were the perfect canvas. Our goal was to show what happens when you stop treating prompts as throwaway text and instead turn them into a durable, JSON‑based contract for every image the system produces.

What it does

ChromaGen turns a brand brief into a JSON “source of truth” that drives a full agentic workflow around FIBO. From a single request (plus optional reference images), it:

  • Builds a structured JSON prompt capturing palette, mood, and layout.
  • Uses that JSON to steer FIBO and a LoRA‑enhanced color pipeline.
  • Automatically checks each image for brand palette coverage and gaze direction.
  • Surfaces clear feedback and overlays so teams can quickly see what passed, what failed, and why.

Instead of hand‑policing every asset, teams get a stream of images that are generated, evaluated, and auto‑corrected until they match what the brief promised.
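As a rough illustration of what a palette coverage check could look like: the sketch below counts the share of pixels that fall within a distance tolerance of any brand color. It is not ChromaGen's actual verifier. The function names, the RGB distance metric, the tolerance of 40, and the 0.7 pass threshold are all assumptions chosen for the example.

```python
from math import dist

RGB = tuple[int, int, int]


def hex_to_rgb(value: str) -> RGB:
    """Convert a '#RRGGBB' string to an (r, g, b) tuple."""
    value = value.lstrip("#")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def palette_coverage(pixels: list[RGB], palette: list[RGB], tolerance: float = 40.0) -> float:
    """Fraction of pixels within `tolerance` (Euclidean RGB distance) of any palette color."""
    if not pixels:
        return 0.0
    matched = sum(1 for p in pixels if any(dist(p, c) <= tolerance for c in palette))
    return matched / len(pixels)


# Hypothetical brand palette and a tiny stand-in for image pixels.
brand = [hex_to_rgb("#0A3D62"), hex_to_rgb("#F6B93B")]
pixels = [(12, 60, 100), (245, 185, 60), (200, 200, 200), (10, 61, 98)]

coverage = palette_coverage(pixels, brand)
passed = coverage >= 0.7  # assumed threshold, to be tuned on real images
print(f"coverage={coverage:.2f} passed={passed}")
```

A real check would probably work in a perceptual color space and on downsampled images, but the pass/fail shape stays the same: a score, a threshold, and a reason to show in the overlay.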

How we built it

On the backend, we wired a LangGraph‑style agent on top of FastAPI. The agent coordinates a set of tools: one that talks to BRIA’s FIBO endpoint and our LoRA pipeline, another that generates color palettes, one that verifies palettes in finished images, and a gaze tracker that runs through an MCP server.

For JSON prompt creation, we leaned on Gemini to turn natural language (and optional reference images) into structured prompts that FIBO and the LoRA pipeline can consume directly. We store state and history per chat/thread, so every run is reproducible and auditable. On the frontend, a lightweight Vite/React app lets you chat with the agent, preview outputs, and inspect compliance overlays without ever seeing a raw API call.
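To make the "JSON as contract, stored per thread" idea concrete, here is a minimal sketch. The field names (`brief`, `palette`, `mood`, `layout`), the in-memory `history` store, and the fingerprinting step are illustrative assumptions, not the schema FIBO expects or the storage ChromaGen uses.

```python
import hashlib
import json


def build_prompt(brief: str, palette: list[str], mood: str, layout: str) -> dict:
    """Assemble a structured prompt; the keys here are hypothetical."""
    return {
        "brief": brief,
        "palette": [c.upper() for c in palette],
        "mood": mood,
        "layout": layout,
    }


def prompt_fingerprint(prompt: dict) -> str:
    """Hash a canonical JSON form so identical prompts get identical IDs."""
    canonical = json.dumps(prompt, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical per-thread history; a real service would persist this.
history: dict[str, list[dict]] = {}


def record_run(thread_id: str, prompt: dict) -> str:
    """Append the prompt and its fingerprint to the thread's history."""
    fingerprint = prompt_fingerprint(prompt)
    history.setdefault(thread_id, []).append({"fingerprint": fingerprint, "prompt": prompt})
    return fingerprint


prompt = build_prompt(
    "Spring launch hero image",
    ["#0a3d62", "#f6b93b"],
    "calm",
    "subject left, copy space right",
)
run_id = record_run("thread-demo", prompt)
print(run_id[:12], json.dumps(prompt))
```

Sorting keys before hashing means two prompts with the same content always share a fingerprint, which is what makes replaying or auditing a run straightforward.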

Challenges we ran into

Coordinating so many moving parts (LLM agents, BRIA/FIBO, LoRA weights, gaze MCP, and the frontend) was a real orchestration challenge. Getting all the services to agree on formats, especially JSON schemas and image paths, took more iteration than expected.

We also had to carefully manage when and how the agent called tools: a single extra generate_image call could noticeably increase latency and cost. And making gaze and palette checks feel “reliable enough” on real images, not just perfect demos, forced us to tune thresholds and failure behavior instead of just trusting defaults.

Building the training dataset was its own multi-stage pipeline challenge: color quantization, JSON generation via Gemini-2.5-flash, and per-object palette extraction with SAM3. Each stage had to output formats the next stage could read without changes; any mismatch in palette structures or JSON schemas would break downstream processing. With only ~1500 images and limited compute, we couldn't afford reprocessing when formats changed. The trained LoRA didn't always match palettes exactly, so we added post-processing recoloring to enforce strict adherence. See the dataset creation repository for full details.
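The simplest form of post-processing recoloring is to snap every pixel to its nearest palette color. The sketch below shows that idea only; it is an assumption about what such a step could look like, not the recoloring ChromaGen ships, which likely preserves shading and works per object.

```python
RGB = tuple[int, int, int]


def nearest_palette_color(pixel: RGB, palette: list[RGB]) -> RGB:
    """Pick the palette color with the smallest squared RGB distance."""
    return min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)))


def recolor(pixels: list[RGB], palette: list[RGB]) -> list[RGB]:
    """Map every pixel onto the palette (hard quantization)."""
    return [nearest_palette_color(p, palette) for p in pixels]


# Hypothetical two-color brand palette and slightly off-palette pixels.
palette = [(10, 61, 98), (246, 185, 59)]
off_brand = [(12, 60, 100), (250, 190, 50)]

print(recolor(off_brand, palette))
```

Hard snapping guarantees every output color is in the palette, at the cost of flattening gradients, which is why a production step would usually blend or remap only hue and keep luminance.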

Accomplishments that we're proud of

We’re proud that ChromaGen feels like a production‑ready pattern, not just a demo. It:

  • Treats JSON as the contract between humans, the agent, and FIBO.
  • Uses LoRA color control in a way that’s directly tied to brand palettes, not just aesthetics.
  • Adds automated gaze and palette checks that actually catch off‑brand outcomes.
  • Wraps everything in a simple, friendly UX that hides the complexity behind one conversational entrypoint.

Most of all, we’re excited that a brand or creative team could realistically drop this into their workflow and see immediate value.

What we learned

We learned that the real power of FIBO isn’t just in the model itself, but in how you wrap it with structure and agents. When you move from “ad hoc prompts” to “stable JSON schemas,” logging, debugging, replaying, and scaling across campaigns all get easier.

We also saw firsthand how important evaluation tools are. Gaze tracking, palette checking, and LoRA‑based corrections turn generative AI from a creative toy into something that can actually respect brand rules and reduce manual review time.

What's next for ChromaGen

Next, we’d like to deepen the JSON contract: add richer brand rules (logo usage, safe zones, backgrounds), support more nuanced gaze and pose constraints, and plug in additional quality checks like text legibility or accessibility.

We also see a clear path to team‑level features: workspaces, approvals, and analytics that show which briefs, palettes, and layouts perform best over time. Long term, we want ChromaGen to be the “brand sidecar” for FIBO: a JSON‑native agent that quietly guarantees every image is both beautiful and on‑brand.
