AD-EX

Upgrading a failing ad into a high-performing creative
AI pipeline for retrieving, analyzing, and generating improved ad creatives with text-safe diffusion editing
Accurate CTR prediction matching real-world ad fatigue
Automated feature extraction and bounding box detection using our vision pipeline

Inspiration

In programmatic advertising, creative fatigue is one of the biggest hidden costs. A mobile ad can perform well for a few days and then quickly lose user attention, forcing advertisers to constantly test, refresh, and redesign creatives.

We were inspired by a simple question:

What if an AI system could not only detect that an ad is getting tired, but also understand why it is failing and generate a better version automatically?

That idea became AD-EX: an AI-powered creative optimization platform that combines retrieval, multimodal understanding, generative editing, and performance forecasting.

What it does

AD-EX is an end-to-end AI pipeline for improving underperforming mobile ad creatives.

Given an original ad, the system analyzes its visual structure, compares it against historically stronger creatives, identifies what is missing, and generates improved variants while preserving important text regions.

The pipeline works in five main stages:

1. Creative Retrieval: AD-EX finds similar, better-performing ads using a custom retrieval score based on semantic similarity, campaign context, performance, confidence, creative quality, and health/fatigue signals.
2. Visual Understanding: The system uses SAM 2.1 to segment the ad into visual components and GPT-4o Vision to convert those components into structured semantic JSON descriptions.
3. Gap Analysis: An LLM compares the original ad with retrieved high-performing references and identifies actionable creative gaps, such as missing visual hooks, weak CTA placement, poor layout, low brand visibility, or ineffective messaging.
4. Generative Editing: Improvement prompts, original image information, and protected text masks are passed into a diffusion-based generation module to create upgraded ad variants without destroying key brand or text elements.
5. Performance Simulation and Insights: The generated creative is evaluated through predictive models and displayed in an interactive dashboard, where users can compare the original and improved versions.

How we built it

We built AD-EX as a modular full-stack system with a Python AI backend and a React frontend.

Backend

The backend is built with FastAPI and orchestrates the AI pipeline. We kept the retrieval and scoring components loosely coupled so that new metrics, embeddings, or business rules can be added without breaking the rest of the system.

The retrieval engine combines several signals:

$$ \text{FinalScore} = w_s S_{\text{sim}} + w_c S_{\text{context}} + w_p S_{\text{performance}} + w_q S_{\text{quality}} + w_r S_{\text{confidence}} + w_h S_{\text{health}} $$

where each w is the weight assigned to its signal, similarity measures how close two creatives are, context checks whether they belong to comparable campaign settings, performance captures historical results, quality reflects the creative quality of the reference ad, confidence estimates data reliability, and health penalizes fatigue or decay.
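As a minimal sketch, the score can be computed as a plain weighted sum over named signals. The weight values, the signal names used as dictionary keys, and the assumption that every signal is already normalized to [0, 1] are illustrative only; the values actually used are not stated in this write-up.

```python
# Hypothetical weights that sum to 1.0; not the values used in AD-EX.
WEIGHTS = {
    "similarity": 0.30,
    "context": 0.20,
    "performance": 0.20,
    "quality": 0.10,
    "confidence": 0.10,
    "health": 0.10,
}


def final_score(signals: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-signal scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(signals)
    if missing:
        # Fail loudly instead of silently treating a missing signal as zero.
        raise ValueError(f"missing signals: {sorted(missing)}")
    return sum(weights[name] * signals[name] for name in weights)


# Toy candidate: very similar, same campaign context, but showing fatigue.
candidate = {
    "similarity": 0.8,
    "context": 1.0,
    "performance": 0.6,
    "quality": 0.5,
    "confidence": 0.9,
    "health": 0.4,
}
score = final_score(candidate)  # 0.74, up to floating-point rounding
```

Keeping the weights in a single mapping means a new signal can be added by extending the dictionary, without touching the scoring function.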

Computer Vision and Multimodal Analysis

We use SAM 2.1 to identify relevant regions in the ad and protect text areas from destructive edits. Then, GPT-4o Vision analyzes the segmented elements and generates structured JSON metadata describing the layout, text, CTA, logo, visual hierarchy, and semantic content of the creative.

These JSON descriptions are converted into embeddings, which the retrieval engine uses to find similar successful ads.
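The similarity lookup over those embeddings can be sketched with cosine similarity, a common choice for comparing embedding vectors. Which similarity measure and embedding model AD-EX uses is not stated here, so the measure, the three-dimensional toy vectors, and the ad identifiers below are all assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query: list[float], library: dict[str, list[float]], k: int = 2):
    """Return the k (ad_id, similarity) pairs most similar to the query."""
    scored = [(ad_id, cosine_similarity(query, vec)) for ad_id, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings standing in for real embeddings of the JSON descriptions.
library = {
    "ad_001": [0.9, 0.1, 0.0],
    "ad_002": [0.1, 0.9, 0.2],
    "ad_003": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]
neighbours = top_k_similar(query, library, k=2)  # ad_001 first, then ad_003
```

In the full retrieval score, this similarity would be only one of the weighted signals, alongside context, performance, quality, confidence, and health.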

Generative AI

The generation module uses the gap analysis output to create targeted improvement prompts. These prompts guide a diffusion-based image editing process, while text masks help preserve important copy such as CTAs, headlines, prices, and brand names.
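One simple way to guarantee that masked copy survives an edit is to paste the original pixels back over the generated image wherever the text mask is set. The sketch below shows only that compositing step on toy grayscale grids. It is an illustration of the idea, not the diffusion editing itself, and it does not claim to be how the AD-EX generation module applies its masks.

```python
Image = list[list[int]]   # toy grayscale image, row-major
Mask = list[list[bool]]   # True marks a protected text pixel


def composite_with_text_mask(original: Image, generated: Image, text_mask: Mask) -> Image:
    """Keep original pixels where text_mask is True, generated pixels elsewhere."""
    if not (len(original) == len(generated) == len(text_mask)):
        raise ValueError("images and mask must have the same number of rows")
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, text_mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("images and mask must have the same row length")
        result.append([o if m else g for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


original = [[10, 20], [30, 40]]
generated = [[1, 2], [3, 4]]
text_mask = [[True, False], [False, True]]   # protect the two diagonal pixels
protected = composite_with_text_mask(original, generated, text_mask)
# protected == [[10, 2], [3, 40]]
```

With real images the same operation would normally run on arrays, and the mask edges would likely need feathering to avoid visible seams around the text.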

Predictive Modeling

We also experimented with a custom PyTorch autoregressive RNN/GRU to simulate how creative performance evolves over time. The model combines visual embeddings with campaign and audience features such as country and operating system to forecast CTR dynamics.
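The autoregressive idea can be illustrated without any dependencies: a GRU cell updates a hidden state from the previous CTR plus static campaign features, a readout maps the state to the next CTR, and that prediction is fed back in as the next input. This sketch is not the PyTorch model described above. It uses a scalar hidden state, and every weight below is an arbitrary untrained placeholder, so the numbers it produces carry no meaning.

```python
import math
from dataclasses import dataclass


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class Gate:
    w: list[float]  # input weights
    u: float        # recurrent weight on the (scalar) hidden state
    b: float        # bias

    def pre(self, x: list[float], h: float) -> float:
        """Pre-activation: w . x + u * h + b."""
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.u * h + self.b


def gru_step(x: list[float], h: float, update: Gate, reset: Gate, cand: Gate) -> float:
    """One GRU update with a scalar hidden state."""
    z = sigmoid(update.pre(x, h))            # update gate
    r = sigmoid(reset.pre(x, h))             # reset gate
    h_tilde = math.tanh(cand.pre(x, r * h))  # candidate state
    return (1.0 - z) * h + z * h_tilde


# Placeholder parameters (NOT trained). Inputs: [previous CTR, is_android, is_spain].
UPDATE = Gate(w=[0.5, 0.1, -0.2], u=0.3, b=0.0)
RESET = Gate(w=[-0.4, 0.2, 0.1], u=0.1, b=0.0)
CANDIDATE = Gate(w=[1.0, -0.3, 0.2], u=0.8, b=-0.5)
READOUT = (2.0, -3.0)  # (weight, bias) mapping hidden state to a CTR logit


def forecast_ctr(last_ctr: float, static_features: list[float], steps: int) -> list[float]:
    """Roll the cell forward, feeding each predicted CTR back as the next input."""
    h, ctr, curve = 0.0, last_ctr, []
    for _ in range(steps):
        x = [ctr] + static_features
        h = gru_step(x, h, UPDATE, RESET, CANDIDATE)
        ctr = sigmoid(READOUT[0] * h + READOUT[1])  # keeps the CTR in (0, 1)
        curve.append(ctr)
    return curve


curve = forecast_ctr(last_ctr=0.04, static_features=[1.0, 0.0], steps=5)
```

In a trained model the visual embedding would be part of the input vector as well; it is left out here to keep the example short.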

Frontend

The frontend is built with React + Vite and styled for a clean dashboard experience. It allows users to inspect creatives, compare original and improved variants, and understand the reasoning behind each recommendation.

Challenges we ran into

One of the biggest challenges was making all parts of the pipeline work together reliably. SAM, GPT-4o Vision, embedding generation, retrieval, diffusion editing, and performance prediction all produce different types of outputs, so we had to design a clean intermediate representation using structured JSON metadata.
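A shared representation of this kind can be sketched as a small typed schema that serializes to JSON. The class names, field names, and example values below are hypothetical and chosen only for illustration; the actual AD-EX schema is not reproduced here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Region:
    label: str                        # e.g. "cta", "logo", "headline"
    bbox: tuple[int, int, int, int]   # x, y, width, height in pixels
    text: Optional[str] = None        # copy detected in the region, if any
    protected: bool = False           # True if edits must not touch this region


@dataclass
class CreativeMetadata:
    creative_id: str
    layout: str
    regions: list[Region] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize to the JSON form passed between pipeline stages."""
        return json.dumps(asdict(self), sort_keys=True)

    def protected_regions(self) -> list[Region]:
        """Regions whose pixels should be masked during generation."""
        return [r for r in self.regions if r.protected]


meta = CreativeMetadata(
    creative_id="demo-001",
    layout="hero image with bottom CTA",
    regions=[
        Region(label="cta", bbox=(40, 520, 240, 60), text="Play now", protected=True),
        Region(label="background", bbox=(0, 0, 320, 640)),
    ],
)
payload = meta.to_json()
```

Every stage can then read and write the same structure: segmentation fills in the regions, the vision model adds the semantic fields, and the generation stage reads `protected_regions()` to build its masks.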

Another challenge was retrieval quality. A high-performing ad is not always a useful reference if it belongs to a completely different vertical, objective, or audience. We had to carefully balance similarity, context, and performance so that the system recommends creatives that are both better and meaningfully comparable.

We also had to deal with noisy synthetic data. Some dataset variables looked useful at first but were actually misleading or too engineered, so we refined our scoring metrics to avoid relying on suspicious columns and focused on more defensible signals.

Finally, preserving text during generation was difficult. Text is crucial in ads, but generative models can easily distort it. Using SAM-based text masks helped us protect key regions during the editing process.

Accomplishments that we're proud of

We are proud of building a complete AI workflow that goes beyond simple ad analysis. AD-EX can:

understand an ad visually and semantically,
retrieve better-performing references,
explain what the original creative is missing,
generate improvement prompts,
protect important text regions,
and present the results in an interactive product demo.

We are especially proud of the retrieval system because it combines multiple signals into a practical recommendation engine rather than relying on a single metric like CTR or ROAS.

What we learned

We learned how difficult it is to connect multimodal AI components into a single coherent product. Each model is powerful on its own, but the real challenge is designing the interfaces between them.

We also learned that retrieval quality depends heavily on metric design. Similarity alone is not enough, and performance alone can be misleading. A good recommendation system needs to balance semantic similarity, campaign context, historical performance, confidence, and fatigue.

On the engineering side, we gained experience building modular FastAPI services, managing asynchronous AI tasks, generating semantic embeddings, and creating a frontend capable of presenting complex AI reasoning in a user-friendly way.

What's next for AD-EX

The next step is to expand AD-EX from static image creatives to video ad optimization, where the system could detect weak frames, segment important visual regions, and generate improved variants over time.

We also want to improve the forecasting module so it predicts not only CTR, but also business-level metrics such as conversions, revenue, and ROAS.

Finally, we would like to add a stronger conversational layer, allowing advertisers to ask questions such as:

“Make this version more premium,”
“Keep the CTA but change the background,”
“Find examples that performed better with younger audiences,”
“Generate a variant optimized for Android users in Spain.”

Our long-term vision is for AD-EX to become a creative copilot for performance marketing teams: a system that can analyze, explain, generate, and iterate on ad creatives automatically.