Inspiration
Neither of us had touched ML or AI before this challenge — we treated the Smadex prompt as an excuse to learn it from scratch. The brief asked for a tool that helps advertisers understand why a creative wins or fades, and that question turned out to be the perfect playground: it forces you to combine tabular stats, image features, clustering, regression, and explainability all in one place. Building a "creative co-pilot" instead of yet another dashboard felt like the most honest way to put what we were learning to work.
What it does
Smadex Creative Copilot ingests the raw campaign and creative tables and produces a fully self-contained HTML dashboard wrapped around a Gemini-powered assistant that turns the underlying ML into plain language for marketers.
The generative layer is the heart of the product:
- A persistent chatbot with function-calling access to the dataset — it can pull a creative by id, list fatigued assets, run the portfolio optimiser, or compute predicted lift, then narrate the result in context.
- Cluster personas: Gemini reads each cluster's SHAP profile and writes a short strategic description ("Bold-CTA mobile installers", "Soft-tone retargeters", …) so the eight K-Means groups become something a human can actually reason about.
- Per-creative AI analysis in the modal: a streamed Gemini explanation that mixes SHAP drivers, fatigue signals and the KPI goal into one paragraph of advice.
- Metric-in-context explanations ("why is this CTR good for this vertical and audience?") generated on demand.
- Predict Creative + AI Smart Fill: the user describes a draft, Gemini suggests sensible defaults for the missing features, and the ML model returns a calibrated perf-score with explanation.
- All Gemini I/O goes through a single thin wrapper (callGemini/callGeminiStream) with retry, streaming, and tool-call dispatch, so every AI surface in the app shares the same prompt-engineering and safety layer.
Around that, the dashboard offers Explore, Fatigue, Recommendations, ML Insights (feature importance, winning DNA, \(R^2\)/MAE per target), Clusters, Portfolio Optimizer, and a Predict Creative tool.
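The function-calling pattern behind the chatbot can be sketched as a small tool registry. This is only an illustration in Python (the dashboard itself is JavaScript): the tool names, the toy dataset, and the shape of the `tool_call` dictionary are assumptions made for the example, not the app's real code or Gemini's wire format.

```python
import json

# Hypothetical in-memory dataset; the values are made up for illustration.
CREATIVES = {
    "c1": {"id": "c1", "ctr": 0.021, "fatigued": False},
    "c2": {"id": "c2", "ctr": 0.008, "fatigued": True},
}

def get_creative(creative_id):
    """Return one creative by id, or an error payload if it is unknown."""
    return CREATIVES.get(creative_id, {"error": f"unknown creative {creative_id}"})

def list_fatigued():
    """Return the ids of all creatives flagged as fatigued."""
    return [c["id"] for c in CREATIVES.values() if c["fatigued"]]

# Registry mapping tool names (as the model would request them) to functions.
TOOLS = {"get_creative": get_creative, "list_fatigued": list_fatigued}

def dispatch(tool_call):
    """Run the requested tool and return a JSON string for the model to narrate."""
    name = tool_call.get("name")
    args = tool_call.get("args", {})
    fn = TOOLS.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool {name}"})
    try:
        return json.dumps({"result": fn(**args)})
    except TypeError as exc:  # wrong or missing arguments from the model
        return json.dumps({"error": str(exc)})

print(dispatch({"name": "list_fatigued"}))
print(dispatch({"name": "get_creative", "args": {"creative_id": "c1"}}))
```

Returning errors as data rather than raising keeps the conversation going when the model asks for a tool or argument that does not exist.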
How we built it
The pipeline is split in two halves:
- ml_pipeline_sklearn.py: pandas joins the four CSVs, Pillow extracts 9 image features per asset (brightness, contrast, edge density, saturation, …), and we build a 34-dim feature matrix (tabular + label-encoded categorical + binary flags + image). After SimpleImputer + StandardScaler we fit:
  - PCA to \(d' = \min(18, d, n-1)\) dimensions,
  - KMeans(k=8) for personas,
  - RandomForestRegressor (200 trees) for perf_score, CTR, CVR, retention,
  - XGBRegressor per KPI when xgboost is available,
  - shap.TreeExplainer to attribute each prediction back to features and to build cluster SHAP profiles.
- build_dashboard.py injects app_data.js, ml_results.json and the Gemini key into a single dashboard_template.html, producing one shareable file with Chart.js visualisations and the Gemini chatbot wired up via the REST API.
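The modelling half can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the actual ml_pipeline_sklearn.py: the median imputation strategy, fitting K-Means on the PCA output, training the forest on the scaled features, and all random values are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the 34-dim feature matrix (values are made up).
n, d = 300, 34
X = rng.normal(size=(n, d))
X[rng.random(size=X.shape) < 0.05] = np.nan  # simulate missing values
y = rng.normal(size=n)                        # placeholder perf_score target

# Impute and standardise (median strategy is an assumption).
prep = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
X_scaled = prep.fit_transform(X)

# Project to d' = min(18, d, n - 1) principal components.
n_components = min(18, d, n - 1)
X_pca = PCA(n_components=n_components, random_state=0).fit_transform(X_scaled)

# Eight K-Means clusters, later turned into personas.
clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X_pca)

# 200-tree random forest for one target; the write-up fits one per KPI.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_scaled, y)
pred = model.predict(X_scaled)

print(X_pca.shape, np.bincount(clusters), pred[:3])
```

The `min(18, d, n - 1)` cap matters on small segments: PCA cannot return more components than features or than samples minus one.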
Diversity in the Portfolio Optimizer is a weighted blend of categorical novelty (65%) and cosine distance in the first 6 PCs (35%):
$$ \text{div}(c, P) = 0.65\,\bigl(1 - \tfrac{1}{|P|}\sum_{p\in P}\mathbf{1}[c_k = p_k]\bigr) + 0.35\,\min_{p\in P}\bigl(1 - \tfrac{\langle c, p\rangle}{\lVert c\rVert\lVert p\rVert}\bigr) $$
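A minimal NumPy sketch of this score is shown below. It assumes the categorical attributes are label-encoded integers and that the match indicator is averaged over both the categorical fields \(k\) and the portfolio members; the function name and argument layout are hypothetical.

```python
import numpy as np

def diversity(cand_cats, cand_pcs, port_cats, port_pcs, w_cat=0.65, w_geo=0.35):
    """Diversity of one candidate against a non-empty portfolio.

    cand_cats: (k,) label-encoded categorical fields of the candidate
    cand_pcs:  (6,) first principal components of the candidate
    port_cats: (m, k) categorical fields of the m portfolio members
    port_pcs:  (m, 6) principal components of the portfolio members
    """
    cand_cats = np.asarray(cand_cats)
    port_cats = np.asarray(port_cats)
    cand_pcs = np.asarray(cand_pcs, dtype=float)
    port_pcs = np.asarray(port_pcs, dtype=float)

    # Categorical novelty: 1 minus the average share of matching fields.
    cat_novelty = 1.0 - (port_cats == cand_cats).mean()

    # Geometric novelty: cosine distance to the closest portfolio member.
    # Assumes no zero-norm vectors; guard against them in real data.
    norms = np.linalg.norm(port_pcs, axis=1) * np.linalg.norm(cand_pcs)
    cos = (port_pcs @ cand_pcs) / norms
    geo_novelty = np.min(1.0 - cos)

    return w_cat * cat_novelty + w_geo * geo_novelty

port_cats = [[0, 1], [2, 1]]
port_pcs = [[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0]]

# Shares most categories with the portfolio and points the same way as member 1.
print(diversity([0, 1], [1, 0, 0, 0, 0, 0], port_cats, port_pcs))
# Shares no categories and is orthogonal to every member.
print(diversity([5, 6], [0, 0, 1, 0, 0, 0], port_cats, port_pcs))
```

Taking the minimum over the portfolio for the geometric term means a candidate only counts as novel if it is far from its nearest neighbour, not merely far on average.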
We also leaned heavily on Claude as a coding assistant throughout the build — pair-programming the scikit-learn pipeline, debugging SHAP shape mismatches, iterating on the dashboard's Chart.js layouts, and reviewing the final code for bugs and security issues. It accelerated the "learn-while-shipping" loop enormously and let us spend more of our time on product decisions than on syntax.
Challenges we ran into
- New to ML, so a lot of concepts (cross-validation, SHAP, PCA) had to be learned just-in-time as the pipeline grew.
- The curse of dimensionality bit us hard on our first attempt at portfolio diversity, a Gram-Schmidt orthogonalisation in the PCA-reduced space. Even at 15 dimensions, vectors were almost orthogonal by default (for random directions, \(\cos\theta\) concentrates around 0 with standard deviation \(1/\sqrt{d}\)), so every candidate looked "novel". We replaced it with PCA-projected cosine distance + categorical novelty, which behaves intuitively.
- Not enough data. With only ~1k creatives split across 8 clusters, the per-segment models were noisy; we leaned on cross-validation and conservative confidence intervals to avoid overselling.
- Latency. Querying the ML model on every UI interaction was a non-starter for a static HTML app, so we precompute SHAP once and use cluster centroids + PCA distances as cheap heuristics for "similar creatives", "predict-this", and the Optimizer. The expensive Gemini calls are reserved for explanation, not scoring.
- Packaging it as a single file that still talks to Gemini, without leaking the key in the source repo (we now build with a placeholder and inject at deploy time).
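The near-orthogonality effect behind the curse-of-dimensionality point above is easy to reproduce. The sketch below draws random Gaussian vectors, which is an idealisation and not the project's real feature data, and measures the average absolute cosine between pairs as the dimension grows.

```python
import numpy as np

def mean_abs_cosine(dim, n_pairs=2000, seed=0):
    """Average |cos(theta)| between pairs of random Gaussian vectors."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_pairs, dim))
    b = rng.normal(size=(n_pairs, dim))
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    return float(np.mean(np.abs(cos)))

# The typical |cos| shrinks as the dimension grows, so in high dimensions
# almost any pair of vectors looks close to orthogonal.
for dim in (2, 6, 15, 100):
    print(dim, round(mean_abs_cosine(dim), 3))
```

Restricting the geometric term to a handful of principal components keeps cosine distance in a regime where "far apart" still carries information.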
Accomplishments that we're proud of
- Shipping an end-to-end ML + LLM product — pipeline, model, explainability, and a Gemini-driven UI — having started from zero.
- A genuinely explainable prototype: every recommendation traces back to SHAP values and visible cluster context, then gets translated to natural language by Gemini.
- Replacing the broken Gram-Schmidt diversity metric with one whose behaviour we can defend mathematically.
- A polished, single-file dashboard that runs offline and still embeds an LLM assistant with function-calling.
What we learned
- The fundamentals of supervised learning, regularisation, cross-validation, and why \(R^2\) alone lies.
- Why dimensionality reduction matters before any geometric reasoning over features.
- How SHAP turns a tree ensemble into something a marketer can actually read.
- That good UX is half of explainability — a SHAP bar chart only helps if the surrounding copy tells the user what to do with it.
- How to integrate an LLM responsibly: function-calling for facts, prompt-engineering for tone, and never letting it invent numbers the dashboard already has.
What's next for Smadex Creative Copilot
- A real Computer Vision model (small CNN or CLIP embeddings) instead of hand-crafted PIL features.
- Bayesian uncertainty on predictions so the UI can say "we don't know yet" instead of guessing.
- Online learning: ingest new daily stats and update cluster assignments + perf scores incrementally.
- Multi-tenant mode for several advertisers, with per-account fine-tuning.
- A/B-test recommender that proposes the next creative variant rather than just scoring existing ones.
Built With
- chart.js
- gemini
- html5
- javascript
- kmeans
- numpy
- pandas
- pillow
- python
- randomforest
- scikit-learn
- shap
- xgboost