About the Project

Inspiration

Digital advertising generates massive amounts of creative data, yet decision-making remains largely intuitive. Marketers can see performance metrics like ROAS or CTR, but they often lack a clear understanding of why a creative works, when it starts to fail, and what to do next.

This project was inspired by a simple but powerful idea:

What if past creatives could act as an intelligent memory, guiding future decisions?

Instead of building another black-box predictive model, we designed a system that reasons like a human strategist—by comparing new creatives with past ones, identifying patterns, and explaining decisions in a transparent way.


How We Built It

We developed Creative Memory Copilot, a multimodal system based on Case-Based Reasoning (CBR), where each creative is treated as a historical case.

Multimodal Feature Pipeline

We combined multiple sources of information into a unified representation:

  • Tabular data: campaign context, KPIs, lifecycle metrics
  • Visual embeddings:
    • CLIP (semantic understanding)
    • ResNet50 (visual structure)
  • Interpretable vision features: layout, color, texture (OpenCV)
  • Florence-2 spatial reasoning: OCR, grounding, layout semantics

This results in a multimodal vector:

x_final = [x_tabular || x_CLIP || x_CNN || x_Florence]
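In code, this is a simple per-creative concatenation. A minimal sketch (block dimensions here are illustrative; CLIP and ResNet50 commonly emit 512- and 2048-dim embeddings, but the project's exact sizes may differ):

```python
import numpy as np

def build_final_vector(x_tabular, x_clip, x_cnn, x_florence):
    """Concatenate per-modality feature blocks into one multimodal vector.

    Each argument is a 1-D numpy array; block names mirror the formula
    x_final = [x_tabular || x_CLIP || x_CNN || x_Florence].
    """
    return np.concatenate([x_tabular, x_clip, x_cnn, x_florence])

# Toy dimensions, for illustration only.
x_final = build_final_vector(np.zeros(12), np.zeros(512), np.zeros(2048), np.zeros(64))
print(x_final.shape)  # (2636,)
```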


Case-Based Reasoning Engine

Instead of predicting outcomes, the system retrieves similar creatives. Similarity is computed as a weighted average of cosine similarities across feature blocks:

S(q, c) = (Σ_b w_b · cos(x_{q,b}, x_{c,b})) / (Σ_b w_b)

Each similarity block (CLIP, CNN, visual, text, context) contributes independently, enabling fine-grained explainability.
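A minimal sketch of this weighted block-wise similarity (function and block names are ours, not the project's API); keeping the per-block cosines around is what enables the block-level explainability:

```python
import numpy as np

def block_similarity(q_blocks, c_blocks, weights):
    """Weighted average of per-block cosine similarities.

    q_blocks / c_blocks: dict block_name -> 1-D feature vector.
    weights: dict block_name -> non-negative weight w_b.
    Implements S(q, c) = sum_b w_b * cos(x_{q,b}, x_{c,b}) / sum_b w_b.
    """
    num, den = 0.0, 0.0
    per_block = {}
    for name, w in weights.items():
        q, c = q_blocks[name], c_blocks[name]
        cos = float(np.dot(q, c) / (np.linalg.norm(q) * np.linalg.norm(c) + 1e-12))
        per_block[name] = cos  # kept for block-level explainability
        num += w * cos
        den += w
    return num / den, per_block

score, breakdown = block_similarity(
    {"clip": np.array([1.0, 0.0]), "cnn": np.array([0.0, 1.0])},
    {"clip": np.array([1.0, 0.0]), "cnn": np.array([1.0, 0.0])},
    {"clip": 0.7, "cnn": 0.3},
)
print(round(score, 3))  # -> 0.7 (identical CLIP block, orthogonal CNN block)
```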


Decision Layer

On top of retrieval, we built logic to answer:

  1. Which creatives work best? → robust ranking with multi-metric scoring
  2. Which creatives are tired or repetitive? → fatigue + similarity density
  3. What should we test next? → pattern extraction from top-performing neighbors
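The fatigue/repetition signal in point 2 can be sketched as a similarity-density measure; this is an illustrative reconstruction, not the project's actual scoring code:

```python
import numpy as np

def similarity_density(similarities, k=5, threshold=0.9):
    """Crude repetition signal: how crowded a creative's neighborhood is.

    similarities: cosine similarities of one creative to all other creatives.
    Returns the mean of its top-k similarities and the fraction of
    near-duplicates above `threshold` (both names are ours, not the
    project's actual API).
    """
    sims = np.sort(np.asarray(similarities))[::-1]
    top_k_mean = float(sims[:k].mean())
    dup_ratio = float((sims > threshold).mean())
    return top_k_mean, dup_ratio

# A creative with two near-duplicates in the library scores high on both signals.
top_k_mean, dup_ratio = similarity_density([0.95, 0.92, 0.40, 0.30, 0.10], k=3)
```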

Interactive Product

We implemented a full web app with:

  • Creative ranking and explainability
  • Fatigue/repetition analysis
  • Recommendation engine (DO / DON’T)
  • 2D creative landscape (UMAP/t-SNE)

All backed by a structured database and retrieval engine.
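The 2D landscape boils down to projecting the multimodal vectors into two dimensions. A deterministic sketch using PCA (also part of the stack; the app itself renders the final view with UMAP/t-SNE):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical multimodal vectors for 10 creatives (dimension is illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 64))

# Project to 2-D for plotting the creative landscape.
coords = PCA(n_components=2).fit_transform(X)
print(coords.shape)  # (10, 2)
```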


Challenges We Faced

1. Data Leakage (Critical)

One of the biggest challenges was avoiding misleading signals.

Variables like:

  • last_7d_ctr
  • last_7d_cvr
  • lifecycle KPIs

look extremely predictive—but they leak future information.

We solved this by:

  • separating prelaunch vs early vs lifecycle features
  • enforcing strict feature sets via metadata JSON
  • designing the CBR to operate only on valid decision-time data
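A sketch of the metadata-driven feature gating (the JSON schema and feature names here are hypothetical):

```python
import json

# Hypothetical metadata file; the real schema lives in the project's repo.
FEATURE_SETS = json.loads("""
{
  "prelaunch": ["width", "height", "n_faces", "dominant_hue"],
  "early":     ["day1_ctr", "day1_spend"],
  "lifecycle": ["last_7d_ctr", "last_7d_cvr"]
}
""")

def decision_time_features(row, stage="prelaunch"):
    """Keep only features valid at the given decision stage, so lifecycle
    KPIs such as last_7d_ctr can never leak into prelaunch retrieval."""
    allowed = set()
    for s in ["prelaunch", "early", "lifecycle"]:
        allowed |= set(FEATURE_SETS[s])
        if s == stage:
            break
    return {k: v for k, v in row.items() if k in allowed}

row = {"width": 1080, "last_7d_ctr": 0.031, "day1_ctr": 0.02}
print(decision_time_features(row, stage="prelaunch"))  # {'width': 1080}
```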

2. Multimodal Alignment

Combining tabular data, dense embeddings, and spatial-reasoning features into a single representation is non-trivial.

We had to:

  • normalize per block
  • design weighted similarity
  • validate retrieval quality offline
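The per-block normalization step can be sketched as follows (names illustrative):

```python
import numpy as np

def normalize_blocks(blocks):
    """L2-normalize each feature block independently before weighting,
    so no single modality dominates just because of its scale or
    dimensionality. A sketch of the 'normalize per block' step."""
    return {name: v / (np.linalg.norm(v) + 1e-12) for name, v in blocks.items()}

normed = normalize_blocks({
    "tabular": np.array([3.0, 4.0]),    # norm 5 -> [0.6, 0.8]
    "clip": np.array([0.0, 2.0, 0.0]),  # norm 2 -> [0.0, 1.0, 0.0]
})
```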

3. Explainability vs Performance Trade-off

Most high-performing systems are black boxes.

We intentionally chose a harder path:

  • keep performance high
  • while ensuring every decision can be explained

This required:

  • interpretable visual features
  • Florence-2 semantic grounding
  • block-level similarity breakdown

4. Creative Semantics

Images are complex:

  • same layout ≠ same meaning
  • same concept ≠ same performance

Using:

  • CLIP (semantic)
  • CNN (visual)
  • Florence (spatial reasoning)

was essential to capture the full picture.


What We Learned

1. Retrieval > Prediction in Creative Problems

For creative strategy, finding good analogies is often more useful than predicting a number.

CBR proved to be:

  • more interpretable
  • more actionable
  • closer to human reasoning

2. EDA is Not Optional

Deep exploratory analysis revealed:

  • leakage traps
  • duplicated features
  • misleading correlations

Without it, the system would have been fundamentally flawed.


3. Vision Needs Semantics

Embeddings alone are not enough.

Florence-2 enabled:

  • spatial understanding
  • human-readable explanations
  • actionable insights

This became the key differentiator of the project.


4. Balance Between Signals Matters

Each modality contributes differently:

  • CLIP → concept
  • CNN → structure
  • Florence → layout & meaning
  • Tabular → performance context

The system only works when these are properly balanced.


Final Outcome

We built more than a model.

We built a system that:

  • remembers past creatives
  • explains performance
  • detects fatigue and repetition
  • recommends what to test next

All grounded in real data and interpretable reasoning.

Creative Memory Copilot turns historical data into a decision-making engine for creative strategy.

Built With

  • case-based-reasoning
  • clip
  • csv
  • fastapi
  • florence-2
  • hugging-face-transformers
  • interactive-2d-embedding-visualization
  • json
  • opencv
  • parquet
  • pca
  • python
  • pytorch
  • resnet50
  • sqlite
  • torchvision
  • umap
  • uvicorn