Inspiration

We saw AppLovin’s challenge as an opportunity to help Axon understand what makes a creative work — not just who to show it to. Our team loves multimodal modeling and wanted to bring more interpretable intelligence into real-time ad recommendations.

What it does

Our system converts each ad (image or video) into a latent embedding and projects it onto five human-interpretable creative style axes:

wealthy, limited-offer, calm, honest, certified

Each ad gets a 5-dimensional vector \( \mathbf{f}(x) = \left( \langle \hat{\mathbf{z}}_x, \hat{\mathbf{w}}_{\text{wealthy}} \rangle,\ \langle \hat{\mathbf{z}}_x, \hat{\mathbf{w}}_{\text{limited}} \rangle,\ \ldots \right) \), capturing how strongly that creative expresses each concept.
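
As a minimal sketch of this projection, assuming the ad embedding and the five axis directions are already available as NumPy arrays. The function name `style_features` and the random stand-in data are illustrative only and are not taken from the project code:

```python
import numpy as np

AXES = ["wealthy", "limited-offer", "calm", "honest", "certified"]

def style_features(z, W):
    """Cosine activation of one embedding z (d,) against axis directions W (k, d)."""
    z_hat = z / np.linalg.norm(z)                          # unit-normalize the ad embedding
    W_hat = W / np.linalg.norm(W, axis=1, keepdims=True)   # unit-normalize each axis
    return W_hat @ z_hat                                   # shape (k,), each value in [-1, 1]

# Random stand-ins (assumption): a real run would use an ImageBind embedding
# and the selected axis directions instead.
rng = np.random.default_rng(0)
z = rng.normal(size=1024)
W = rng.normal(size=(len(AXES), 1024))

f = style_features(z, W)
print(dict(zip(AXES, f.round(3))))
```

Because both sides are unit-normalized, each entry is a cosine similarity, so scores are comparable across ads regardless of embedding magnitude.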

How we built it

  1. Use ImageBind to embed each image/video: \( \mathbf{z} \in \mathbb{R}^{1024} \)
  2. Perform PCA on all creatives to find high-variance semantic directions.
  3. Select interpretable axes aligned with principal components using text embedding similarity.
  4. Normalize embeddings and extract cosine-based activations: \( \text{feature}_i(x) = \frac{\mathbf{z}_x}{\|\mathbf{z}_x\|} \cdot \frac{\mathbf{w}_i}{\|\mathbf{w}_i\|} \in [-1, 1] \)

This transforms raw creatives into features that are distinctive, predictive, and scalable to millions of ads.
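
The PCA and axis-selection steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not our exact pipeline: the helper names `pca_directions` and `label_components` are hypothetical, random arrays stand in for ImageBind image and text embeddings, and matching on absolute cosine similarity (because the sign of a principal component is arbitrary) is our assumption here.

```python
import numpy as np

def pca_directions(Z, n_components):
    """Top principal directions of embeddings Z (n, d), returned as unit rows."""
    Zc = Z - Z.mean(axis=0, keepdims=True)   # center before PCA
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Vt[:n_components]

def label_components(components, text_emb, labels):
    """For each component, pick the label whose text embedding has the largest |cosine|."""
    T = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sims = components @ T.T                  # (n_components, n_labels); components are unit-norm
    best = np.abs(sims).argmax(axis=1)
    return [(labels[j], float(sims[i, j])) for i, j in enumerate(best)]

# Stand-in data (assumption): 200 "ads" in 64 dimensions instead of 1024.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 64))
labels = ["wealthy", "limited-offer", "calm", "honest", "certified"]
text_emb = rng.normal(size=(len(labels), 64))

components = pca_directions(Z, n_components=5)
for label, sim in label_components(components, text_emb, labels):
    print(f"{label}: {sim:+.3f}")
```

In this simplified form two components can receive the same label, so a real selection step would also need to deduplicate labels and drop components whose best match is weak.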

Challenges we ran into

  • Few labeled ads → needed unsupervised structure discovery
  • Avoiding meaningless “low-correlation” features that barely activate
  • Ensuring each axis corresponded to recognizable strategy rather than noise
  • Multimodal video + image handling without quality loss
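
One way to screen out axes that barely activate is to measure how often each feature exceeds a threshold across the corpus. The sketch below assumes a matrix `F` of cosine activations (one row per ad); the function names, the threshold, and the minimum rate are hypothetical values for illustration, not the ones we used.

```python
import numpy as np

def activation_rate(F, threshold=0.1):
    """Fraction of ads whose |activation| exceeds threshold, per feature. F is (n_ads, k)."""
    return (np.abs(F) > threshold).mean(axis=0)

def keep_active(F, threshold=0.1, min_rate=0.05):
    """Indices of features that activate on at least min_rate of the ads."""
    return np.flatnonzero(activation_rate(F, threshold) >= min_rate)

# Toy activations: feature 0 fires on every other ad, feature 1 never fires.
F = np.zeros((100, 2))
F[::2, 0] = 0.3

print(activation_rate(F))   # rates are 0.5 for feature 0 and 0.0 for feature 1
print(keep_active(F))       # only feature 0 is kept
```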

Accomplishments that we're proud of

  • Unified video + image features in one embedding space
  • Found orthogonal and explainable ad attributes
  • Reduced complexity: 1024 → 5 meaningful dimensions
  • Feature activations matched real creative classes we observed qualitatively

What we learned

The best features are not just decorrelated — they must:

  • show up often,
  • align with marketing intuition, and
  • differentiate user-facing persuasion strategies.

What’s next for Ad-Feature-Challenge

  • Automatically tag incoming creatives for Axon
  • Extend dimensions using weak supervision + OCR cues

Our technical paper, which covers everything in detail, is available here: https://drive.google.com/file/d/1hu5jRldgn0yXB-3oFjAGVWi_pctjlQ9t/view?usp=sharing
