💡 Inspiration
BrandDiffusion was born from a problem I repeatedly saw in the creator economy:
"Blank Canvas Paralysis" combined with "Brand Safety Anxiety."
Small business owners and marketers using tools like Adobe Express want fast, high-quality visuals, but they also need:
- Brand accuracy
- Product fidelity
- Cultural relevance
- Immediate usability
Generative AI is powerful, but it is unpredictable. If a user asks for "Diwali Sale", most models generate generic lights and fireworks while missing cultural elements like diyas and rangoli. If they ask for "Launch our new sneakers", the product shape often changes, logos distort, and hands holding the product are rendered incorrectly.
I wanted to build a system that does not just generate images, but understands Marketing Intent.
A system that knows the difference between:
- A Product Launch → product accuracy is critical
- A Festival Greeting → emotion and cultural aesthetics matter
- A Sale Campaign → urgency and clarity matter
and enforces brand compliance so the output is production-ready.
⚙️ How I Built It
BrandDiffusion is a modular, multi-stage pipeline written in Python and designed as a backend for an Adobe Express Add-on.
It follows a Generate → Verify → Composite architecture.
Instead of returning one flat image, it outputs an editable design system:
- Background layer
- Subject (person) layer
- Hero product layer
- Logo layer
- Text and layout metadata
Each asset is exported separately so Adobe Express can treat them as independent, editable layers.
🧠 AI Brain – Intent & Use Case Detection
Before any generation, I analyze the user's marketing intent using Llama-3 (via Groq):
$$P(\text{intent}) \in \{\text{Product Launch},\ \text{Festival},\ \text{Sale},\ \text{General}\}$$
This decision controls:
- Whether the exact product must be preserved
- Whether festival objects should replace retail products
- How much creative freedom the background gets
- What kind of marketing copy is generated
For example:
- "Happy Diwali wishes" β diyas, rangoli, festive decor
- "Christmas sale on shoes" β shoes, not ornaments
This classification layer prevents most intent-level hallucinations before any image is generated.
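For illustration, here is a minimal sketch of how this classification step could look with the `groq` Python SDK; the model id, the prompt wording, and the `detect_intent` helper are assumptions, not the project's exact code:
```python
from groq import Groq

INTENTS = ["Product Launch", "Festival", "Sale", "General"]
client = Groq()  # reads GROQ_API_KEY from the environment

def detect_intent(user_prompt: str) -> str:
    """Classify a marketing request into one of the four intents."""
    response = client.chat.completions.create(
        model="llama3-70b-8192",  # assumed Groq-hosted Llama-3 model id
        messages=[
            {"role": "system",
             "content": "Classify the marketing intent of the user's request. "
                        "Answer with exactly one of: " + ", ".join(INTENTS) + "."},
            {"role": "user", "content": user_prompt},
        ],
        temperature=0.0,  # deterministic labels; creativity is not wanted here
    )
    label = response.choices[0].message.content.strip()
    return label if label in INTENTS else "General"  # safe fallback

print(detect_intent("Happy Diwali wishes"))  # expected: "Festival"
```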
🎊 Event Booster Engine
The Event Booster is the heart of BrandDiffusion V39:
**Event Priority Boost**
Festival keywords (Diwali, Christmas, Eid) are strongly emphasized in prompts.
**Background Freedom Mode**
For festivals:
$$\text{ControlNet Scale} = 0.05$$
This allows creative generation of lights, fireworks, and decorations.
**Cultural Context Injection**
The system auto-injects:
- Traditional clothing
- Festival lighting
- Region-appropriate decor
So users never need expert prompt engineering.
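A minimal sketch of how these three rules could be encoded, assuming a simple keyword table; the keyword lists, prompt fragments, and the `boost_prompt` helper are illustrative:
```python
FESTIVAL_RULES = {
    "diwali":    "diyas, rangoli, warm festive lighting, traditional Indian attire",
    "christmas": "christmas tree, fairy lights, cozy winter decor",
    "eid":       "crescent moon, lanterns, elegant traditional clothing",
}

def boost_prompt(prompt: str) -> tuple[str, float]:
    """Inject cultural context and relax ControlNet for festival prompts."""
    lowered = prompt.lower()
    for keyword, fragments in FESTIVAL_RULES.items():
        if keyword in lowered:
            # Background Freedom Mode: near-zero structural constraint
            return f"{prompt}, {fragments}", 0.05
    return prompt, 0.40  # non-festival default: keep structural fidelity

print(boost_prompt("Happy Diwali wishes"))
```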
🖼️ Diffusion Factory – Controlled Visual Synthesis
I use Stable Diffusion XL + ControlNet (Canny) with adaptive control:
| Use Case | ControlNet Scale |
|---|---|
| Festival | 0.05 |
| Sale | 0.60 |
| Product Launch | 0.40 |
This creates a balance:
$$\text{Creativity} \uparrow \text{ for emotional campaigns}$$
$$\text{Fidelity} \uparrow \text{ for product campaigns}$$
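As a sketch, wiring this adaptive scale into Hugging Face diffusers might look as follows; the checkpoint ids are the public SDXL and Canny-ControlNet releases, and the `generate` helper is an assumption:
```python
import cv2
import numpy as np
import torch
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from PIL import Image

SCALE_BY_USE_CASE = {"Festival": 0.05, "Sale": 0.60, "Product Launch": 0.40}

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

def generate(prompt: str, reference: Image.Image, use_case: str) -> Image.Image:
    """One SDXL+ControlNet pass with a per-use-case conditioning scale."""
    # Canny edges of the reference image act as the structural guide.
    edges = cv2.Canny(np.array(reference.convert("RGB")), 100, 200)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))
    return pipe(
        prompt,
        image=control,
        controlnet_conditioning_scale=SCALE_BY_USE_CASE.get(use_case, 0.40),
    ).images[0]
```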
🧍 Human Refinement Pipeline
AI often fails on faces and hands. I solved this using:
- MediaPipe → hand landmark detection
- GFPGAN → face restoration
- SDXL inpainting → local corrections
$$\text{Mask}_{\text{hands}} = \text{Dilate}(\text{MediaPipe}(\text{Image}))$$
Only damaged areas are refined, keeping realism intact.
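A hedged sketch of the hand-mask step using MediaPipe's Hands solution; the dilation radius and the `hand_mask` helper are illustrative:
```python
import cv2
import mediapipe as mp
import numpy as np

def hand_mask(image_bgr: np.ndarray, dilate_px: int = 25) -> np.ndarray:
    """Binary mask covering (and slightly exceeding) every detected hand."""
    mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
    with mp.solutions.hands.Hands(static_image_mode=True) as hands:
        result = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if result.multi_hand_landmarks:
        h, w = mask.shape
        for hand in result.multi_hand_landmarks:
            pts = np.array(
                [(int(lm.x * w), int(lm.y * h)) for lm in hand.landmark],
                dtype=np.int32,
            )
            cv2.fillConvexPoly(mask, cv2.convexHull(pts), 255)
    # The Dilate(...) step: expand the mask past wrists and finger edges.
    return cv2.dilate(mask, np.ones((dilate_px, dilate_px), np.uint8))
```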
📐 PosterLLaMA – Professional Layout Intelligence
I trained a PosterLLaMA-style layout model using:
https://huggingface.co/datasets/creative-graphic-design/PKU-PosterLayout
This dataset contains professionally designed posters with bounding box annotations.
The model learns real graphic design principles and outputs bounding boxes for:
- Title
- Subheading
- Product
- Logo
- Call-to-Action
This turns AI images into designer-quality compositions.
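For illustration, the layout model's output can be thought of as one normalized box per element; the exact field names below are assumptions:
```python
# Illustrative layout prediction: normalized (x, y, width, height) per element.
layout = {
    "title":      {"x": 0.08, "y": 0.06, "w": 0.84, "h": 0.14},
    "subheading": {"x": 0.08, "y": 0.22, "w": 0.60, "h": 0.08},
    "product":    {"x": 0.25, "y": 0.34, "w": 0.50, "h": 0.45},
    "logo":       {"x": 0.80, "y": 0.88, "w": 0.14, "h": 0.08},
    "cta":        {"x": 0.08, "y": 0.86, "w": 0.30, "h": 0.08},
}
```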
✍️ Text Rendering Engine
Diffusion models are unreliable at rendering text, so I removed text generation from the image model entirely.
Workflow:
- PosterLLaMA predicts layout boxes
- LLM generates grounded marketing copy
- Python Pillow renders text inside boxes
Guaranteeing:
- Zero spelling mistakes
- Brand-safe typography
- Print-ready quality
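A minimal sketch of box-constrained rendering with Pillow, reusing the normalized boxes above; the font path and the shrink-to-fit strategy are assumptions:
```python
from PIL import Image, ImageDraw, ImageFont

def render_text(canvas: Image.Image, text: str, box: dict) -> None:
    """Draw `text` inside a normalized layout box, shrinking until it fits."""
    draw = ImageDraw.Draw(canvas)
    W, H = canvas.size
    max_w = int(box["w"] * W)
    size = max(int(box["h"] * H), 8)  # start as tall as the box allows
    font = ImageFont.truetype("fonts/Inter-Bold.ttf", size)  # assumed brand font
    while size > 8 and draw.textlength(text, font=font) > max_w:
        size -= 2  # shrink until the line fits the box width
        font = ImageFont.truetype("fonts/Inter-Bold.ttf", size)
    draw.text((int(box["x"] * W), int(box["y"] * H)), text,
              font=font, fill="#111111")
```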
🧩 Layered Output for Adobe Express
Each poster exports:
layer_1_background.png
layer_2_subject.png
layer_3_hero.png
layer_4_logo.png
response.json
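For illustration, response.json might carry the layer order, boxes, and copy in a shape like this (all field names are assumptions, not the real schema):
```json
{
  "use_case": "Product Launch",
  "layers": [
    {"file": "layer_1_background.png", "z": 0},
    {"file": "layer_2_subject.png", "z": 1},
    {"file": "layer_3_hero.png", "z": 2, "box": [0.25, 0.34, 0.50, 0.45]},
    {"file": "layer_4_logo.png", "z": 3, "box": [0.80, 0.88, 0.14, 0.08]}
  ],
  "text": [
    {"role": "title", "copy": "Run Beyond Limits", "box": [0.08, 0.06, 0.84, 0.14]}
  ]
}
```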
This allows users in Adobe Express to:
- Move, rotate, resize elements
- Change product color
- Replace backgrounds
- Swap logos
- Animate components
BrandDiffusion does not generate an image.
It generates an editable design file.
🎯 Exact Product Preservation Mode
If the user requests:
"Use the exact product from my reference image"
The system:
- Skips AI generation
- Uses the uploaded product directly
Guaranteeing:
$$\text{Product Fidelity} = 100\%$$
This is critical for e-commerce brands.
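A sketch of this mode with Pillow: the uploaded cutout (e.g. from the remove.bg API listed under Built With) is pasted into the hero box with only uniform scaling, so no diffusion pass can touch it. The helper name and box schema are assumptions:
```python
from PIL import Image

def place_exact_product(background: Image.Image,
                        product_rgba: Image.Image,
                        box: dict) -> Image.Image:
    """Paste the user's product cutout into the hero box, unmodified."""
    W, H = background.size
    target_w = int(box["w"] * W)
    scale = target_w / product_rgba.width  # uniform scaling only, never reshaping
    product = product_rgba.resize(
        (target_w, int(product_rgba.height * scale)), Image.LANCZOS
    )
    out = background.convert("RGBA")
    out.paste(product, (int(box["x"] * W), int(box["y"] * H)), product)  # alpha mask
    return out
```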
🔄 Context Switching
Users can re-imagine any product anywhere:
"Show this Nike shoe in a desert at sunset"
The product remains identical.
Only the background is regenerated.
This makes BrandDiffusion a visual recontextualization engine.
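One plausible implementation is SDXL inpainting with the mask inverted around the product, sketched below using the public SDXL inpainting checkpoint; the helper and mask convention are assumptions:
```python
import numpy as np
import torch
from diffusers import StableDiffusionXLInpaintPipeline
from PIL import Image

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16
).to("cuda")

def reimagine_background(composite: Image.Image,
                         product_alpha: Image.Image,
                         scene_prompt: str) -> Image.Image:
    """Regenerate everything except the product pixels."""
    # White = repaint, black = keep: invert the product's alpha channel so
    # only the background around the untouched product is regenerated.
    mask = Image.fromarray(255 - np.array(product_alpha.convert("L")))
    return pipe(prompt=scene_prompt, image=composite, mask_image=mask).images[0]

# e.g. reimagine_background(poster, shoe_alpha, "a desert at sunset, golden light")
```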
🧠 What I Learned
- Pipeline Engineering > Prompt Engineering
- Brands need trust more than creativity
- Design requires layout control, not just image generation
- Marketing AI must understand intent before creating visuals
🚧 Challenges I Faced
**Floating Product Problem**
Solved using depth-aware compositing and lighting alignment.
**Layout Box Assignment**
Correctly identifying which predicted bounding box should contain which asset (product, logo, or text).
**Hand Deformation**
Solved using MediaPipe + targeted inpainting.
**Festival vs. Product Confusion**
Solved by strict AI intent classification and rule enforcement.
🌟 Final Thought
BrandDiffusion transforms Generative AI from an image generator into a real marketing design engine.
It is:
- Editable
- Layer-based
- Brand-safe
- Culturally aware
- Marketing-intelligent
It does not generate images.
It generates campaign-ready design systems.
✅ Credibility & Current State
BrandDiffusion is not a concept project. It is already a working, end-to-end system.
The full pipeline exists:
- Intent understanding
- Reference image analysis
- Diffusion-based generation
- Layer separation
- Adobe Express–ready outputs
It already generates:
- Background
- Hero product
- Subject
- Logo
- Text layout metadata
as separate, editable layers.
This proves BrandDiffusion is not experimental.
It is production-ready design infrastructure.
🔥 Why BrandDiffusion & Call to Action
Most AI tools stop at image generation.
BrandDiffusion goes further and creates editable marketing designs.
It stands out because it:
- Understands marketing intent
- Preserves exact products
- Allows creativity with control
- Fits directly into Adobe Express workflows
BrandDiffusion is not a feature.
It is a new design paradigm.
It bridges AI creativity with real-world branding.
This is where AI becomes usable for serious marketing.
🚀 How We Ship to Market
BrandDiffusion will ship using:
- High-performance GPU servers for fast diffusion inference
- A scalable backend for multiple users
- Adobe Express as the editing and distribution layer
This allows:
- Fast generation
- Reliable performance
- Brand-safe workflows
- Easy adoption by creators and companies
With proper GPU infrastructure and Adobe Express integration,
BrandDiffusion can scale into a commercial AI design engine.
Built With
- controlnet
- gfpgan
- groq-cloud-api
- hugging-face-diffusers
- llama-3
- mediapipe
- openai-clip
- opencv
- pillow
- python
- pytorch
- remove.bg-api
- scikit-learn
- stable-diffusion-xl
