About the Project
Inspiration
Digital advertising generates massive amounts of creative data, yet decision-making remains largely intuitive. Marketers can see performance metrics like ROAS or CTR, but they often lack a clear understanding of why a creative works, when it starts to fail, and what to do next.
This project was inspired by a simple but powerful idea:
What if past creatives could act as an intelligent memory, guiding future decisions?
Instead of building another black-box predictive model, we designed a system that reasons like a human strategist: it compares new creatives with past ones, identifies patterns, and explains its decisions transparently.
How We Built It
We developed Creative Memory Copilot, a multimodal system based on Case-Based Reasoning (CBR), where each creative is treated as a historical case.
Multimodal Feature Pipeline
We combined multiple sources of information into a unified representation:
- Tabular data: campaign context, KPIs, lifecycle metrics
- Visual embeddings:
  - CLIP (semantic understanding)
  - ResNet50 (visual structure)
- Interpretable vision features: layout, color, texture (OpenCV)
- Florence-2 spatial reasoning: OCR, grounding, layout semantics
This results in a multimodal vector:
x_final = [x_tabular || x_CLIP || x_CNN || x_Florence]
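The concatenation above can be sketched in a few lines. This is a minimal illustration and not the project's actual pipeline: the block names, the toy dimensions, and the helpers `build_case_vector` and `block_slices` are assumptions made for the example. Keeping the offsets of each block alongside the flat vector is one way to recover individual blocks later.

```python
from typing import Dict, List, Sequence, Tuple

# Block order follows the formula above; the key names are hypothetical.
BLOCK_ORDER = ("tabular", "clip", "cnn", "florence")


def build_case_vector(blocks: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the feature blocks into one flat vector, in a fixed order."""
    missing = [name for name in BLOCK_ORDER if name not in blocks]
    if missing:
        raise KeyError(f"missing feature blocks: {missing}")
    vector: List[float] = []
    for name in BLOCK_ORDER:
        vector.extend(float(v) for v in blocks[name])
    return vector


def block_slices(blocks: Dict[str, Sequence[float]]) -> Dict[str, Tuple[int, int]]:
    """Return the (start, stop) offsets of every block inside the flat vector."""
    slices: Dict[str, Tuple[int, int]] = {}
    start = 0
    for name in BLOCK_ORDER:
        stop = start + len(blocks[name])
        slices[name] = (start, stop)
        start = stop
    return slices


# Toy dimensions; real embeddings would be much wider.
example = {
    "tabular": [0.2, 1.0],
    "clip": [0.1, 0.3, 0.5],
    "cnn": [0.7],
    "florence": [0.0, 1.0],
}
x_final = build_case_vector(example)
print(len(x_final), block_slices(example)["clip"])
```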
Case-Based Reasoning Engine
Instead of predicting outcomes, the system retrieves similar creatives. Similarity is computed as a weighted average of cosine similarities across feature blocks:
S(q, c) = (Σ w_b · cos(x_q,b, x_c,b)) / (Σ w_b)
Each similarity block (CLIP, CNN, visual, text, context) contributes independently, enabling fine-grained explainability.
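As a rough sketch of the formula above, the function below computes one cosine similarity per block and then the weighted average, returning the per-block values alongside the total so the breakdown can be shown to the user. The function names, the dictionary layout, and the choice to treat a zero vector as similarity 0 are assumptions of this example.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def block_similarity(
    query: Dict[str, Sequence[float]],
    case: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> Tuple[float, Dict[str, float]]:
    """Return (S, per_block) with S = sum(w_b * cos_b) / sum(w_b)."""
    per_block: Dict[str, float] = {}
    weighted_sum = 0.0
    total_weight = 0.0
    for name, weight in weights.items():
        if weight <= 0:
            continue  # blocks with no weight do not take part in the score
        sim = cosine(query[name], case[name])
        per_block[name] = sim
        weighted_sum += weight * sim
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("at least one block needs a positive weight")
    return weighted_sum / total_weight, per_block


score, breakdown = block_similarity(
    {"clip": [1.0, 0.0], "cnn": [0.0, 1.0]},
    {"clip": [1.0, 0.0], "cnn": [1.0, 0.0]},
    {"clip": 3.0, "cnn": 1.0},
)
print(score, breakdown)  # 0.75 {'clip': 1.0, 'cnn': 0.0}
```

Ranking the case base by `score` gives the nearest neighbors, and `breakdown` is what makes each match explainable block by block.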
Decision Layer
On top of retrieval, we built logic to answer:
- Which creatives work best? → robust ranking with multi-metric scoring
- Which creatives are tired or repetitive? → fatigue + similarity density
- What should we test next? → pattern extraction from top-performing neighbors
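One possible reading of "similarity density" is the average similarity of a creative to its closest neighbors: a high value suggests the creative sits in a crowded, repetitive region. The sketch below illustrates that reading only. The definition, the value of `k`, the 0.9 threshold, and the function names are all assumptions, and the fatigue signal itself is left out.

```python
from typing import Dict, List


def similarity_density(sims_to_others: List[float], k: int = 3) -> float:
    """Mean of the k highest similarities to other creatives (0.0 if there are none)."""
    top = sorted(sims_to_others, reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def flag_repetitive(
    pairwise: Dict[str, Dict[str, float]], k: int = 3, threshold: float = 0.9
) -> List[str]:
    """Return the ids of creatives whose similarity density reaches the threshold."""
    flagged = []
    for creative_id, row in pairwise.items():
        others = [sim for other_id, sim in row.items() if other_id != creative_id]
        if similarity_density(others, k) >= threshold:
            flagged.append(creative_id)
    return sorted(flagged)


# Hypothetical pairwise similarities between three creatives.
pairwise = {
    "a": {"b": 0.95, "c": 0.20},
    "b": {"a": 0.95, "c": 0.10},
    "c": {"a": 0.20, "b": 0.10},
}
print(flag_repetitive(pairwise, k=1))  # ['a', 'b']
```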
Interactive Product
We implemented a full web app with:
- Creative ranking and explainability
- Fatigue/repetition analysis
- Recommendation engine (DO / DON’T)
- 2D creative landscape (UMAP/t-SNE)
All backed by a structured database and retrieval engine.
Challenges We Faced
1. Data Leakage (Critical)
One of the biggest challenges was avoiding misleading signals.
Variables like:
- last_7d_ctr
- last_7d_cvr
- lifecycle KPIs
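look extremely predictive, but they leak future information: they are computed from post-launch performance, which is not yet known when a creative decision has to be made.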
We solved this by:
- separating prelaunch vs early vs lifecycle features
- enforcing strict feature sets via metadata JSON
- designing the CBR to operate only on valid decision-time data
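A metadata-driven feature gate along these lines might look like the sketch below. The JSON schema, the `stage` field, and the feature names `campaign_objective` and `first_24h_ctr` are hypothetical; only `last_7d_ctr` and `last_7d_cvr` come from the dataset described here. The idea is that every feature declares the stage at which it becomes available, and anything later than the decision stage is dropped.

```python
import json
from typing import Any, Dict, List

# Hypothetical metadata: each feature declares when it becomes available.
FEATURE_METADATA_JSON = """
{
  "campaign_objective": {"stage": "prelaunch"},
  "first_24h_ctr": {"stage": "early"},
  "last_7d_ctr": {"stage": "lifecycle"},
  "last_7d_cvr": {"stage": "lifecycle"}
}
"""

# Stages in chronological order; a decision may use its own stage and earlier ones.
STAGE_ORDER = ("prelaunch", "early", "lifecycle")


def allowed_features(metadata: Dict[str, Dict[str, str]], decision_stage: str) -> List[str]:
    """List the features that already exist at the given decision stage."""
    cutoff = STAGE_ORDER.index(decision_stage)
    return sorted(
        name
        for name, info in metadata.items()
        if STAGE_ORDER.index(info["stage"]) <= cutoff
    )


def select_features(
    row: Dict[str, Any], metadata: Dict[str, Dict[str, str]], decision_stage: str
) -> Dict[str, Any]:
    """Keep only decision-time features; reject columns with no metadata entry."""
    unknown = sorted(set(row) - set(metadata))
    if unknown:
        raise KeyError(f"features without metadata: {unknown}")
    allowed = set(allowed_features(metadata, decision_stage))
    return {name: value for name, value in row.items() if name in allowed}


metadata = json.loads(FEATURE_METADATA_JSON)
row = {"campaign_objective": "installs", "first_24h_ctr": 0.012, "last_7d_ctr": 0.02}
print(select_features(row, metadata, "prelaunch"))  # {'campaign_objective': 'installs'}
```

Rejecting columns that have no metadata entry means a newly added feature cannot slip into the retrieval step before someone has classified it.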
2. Multimodal Alignment
Combining:
- tabular data,
- embeddings,
- and spatial reasoning
is non-trivial.
We had to:
- normalize per block
- design weighted similarity
- validate retrieval quality offline
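To illustrate the per-block normalization step, the sketch below z-scores every column of a single block across the case base. Cosine similarity ignores the overall scale of a block, but inside a block a column with a large numeric range would otherwise dominate the angle, which matters most for tabular features. Using z-scores (rather than, say, min-max scaling) and the name `standardize_block` are assumptions of this example.

```python
import math
from typing import List


def standardize_block(rows: List[List[float]]) -> List[List[float]]:
    """Z-score each column of one feature block across all cases."""
    if not rows:
        return []
    n = len(rows)
    width = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(width)]
    stds = []
    for j in range(width):
        variance = sum((row[j] - means[j]) ** 2 for row in rows) / n
        # A constant column has zero spread; divide by 1.0 so it maps to 0.0.
        stds.append(math.sqrt(variance) or 1.0)
    return [[(row[j] - means[j]) / stds[j] for j in range(width)] for row in rows]


# Two cases, two tabular features on very different scales.
print(standardize_block([[1.0, 100.0], [3.0, 300.0]]))  # [[-1.0, -1.0], [1.0, 1.0]]
```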
3. Explainability vs Performance Trade-off
Most high-performing systems are black boxes.
We intentionally chose a harder path:
- keep performance high
- while ensuring every decision can be explained
This required:
- interpretable visual features
- Florence-2 semantic grounding
- block-level similarity breakdown
4. Creative Semantics
Images are complex:
- same layout ≠ same meaning
- same concept ≠ same performance
Using:
- CLIP (semantic)
- CNN (visual)
- Florence (spatial reasoning)
was essential to capture the full picture.
What We Learned
1. Retrieval > Prediction in Creative Problems
For creative strategy, finding good analogies is often more useful than predicting a number.
CBR proved to be:
- more interpretable
- more actionable
- closer to human reasoning
2. EDA is Not Optional
Deep exploratory analysis revealed:
- leakage traps
- duplicated features
- misleading correlations
Without it, the system would have been fundamentally flawed.
3. Vision Needs Semantics
Embeddings alone are not enough.
Florence-2 enabled:
- spatial understanding
- human-readable explanations
- actionable insights
This became the key differentiator of the project.
4. Balance Between Signals Matters
Each modality contributes differently:
- CLIP → concept
- CNN → structure
- Florence → layout & meaning
- Tabular → performance context
The system only works when these are properly balanced.
Final Outcome
We built more than a model.
We built a system that:
- remembers past creatives
- explains performance
- detects fatigue and repetition
- recommends what to test next
All grounded in real data and interpretable reasoning.
Creative Memory Copilot turns historical data into a decision-making engine for creative strategy.