Inspiration

"What if you could turn users into embeddings?"

Every user on X has their own opinions, wants, goals, and motivations. AdRag aims to encode the public data related to a user into 408-dimensional vectors and then cluster users on those vectors.

How we built our project

To do this, we designed a novel embedding model that takes in the posts a user has written, the people they follow, their followers, the posts they've replied to, and more. We collected this data for around 1,143 users and generated embeddings on all of it using the UMBC supercomputing cluster. We then trained a GNN (Graph Neural Network) on the embeddings to make them context-aware of each other (for example, through follow relationships and post replies). The result of this pipeline is a vector database in which users with similar public data naturally cluster together.
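To make the "context-aware" step concrete, here is a minimal sketch of a single round of neighbor averaging over a follow graph. It is a simplified stand-in for a GNN layer, not our trained model: there are no learned weights, and the names `mean_aggregate`, `self_weight`, and the toy users are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def mean_aggregate(
    embeddings: Dict[str, Vector],
    edges: Dict[str, List[str]],
    self_weight: float = 0.5,  # assumed blend factor, not a tuned value
) -> Dict[str, Vector]:
    """Blend each user's vector with the mean of its neighbors' vectors."""
    updated: Dict[str, Vector] = {}
    for user, vec in embeddings.items():
        # Only keep neighbors we actually have an embedding for.
        neighbors = [n for n in edges.get(user, []) if n in embeddings]
        if not neighbors:
            # Isolated users keep their original vector.
            updated[user] = list(vec)
            continue
        dim = len(vec)
        mean = [
            sum(embeddings[n][i] for n in neighbors) / len(neighbors)
            for i in range(dim)
        ]
        updated[user] = [
            self_weight * vec[i] + (1 - self_weight) * mean[i]
            for i in range(dim)
        ]
    return updated


# Toy 2-dimensional example (real vectors would be 408-dimensional).
embeddings = {"alice": [1.0, 0.0], "bob": [0.0, 1.0], "carol": [1.0, 1.0]}
edges = {"alice": ["bob"], "bob": ["alice"], "carol": []}
smoothed = mean_aggregate(embeddings, edges)
```

After one round, users who follow each other have moved toward each other in the vector space, which is the intuition behind letting graph structure shape the embeddings.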

The next step is simple: take a username, pull their public footprint from the X API, convert it into our unified embedding format, and drop it into our vector space. From there, we run a nearest-neighbor search to find the clusters they naturally fall into. In other words, a user can type in "@handle", and AdRag shows which users they think like by surfacing the clusters closest to them in embedding space.
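The lookup step can be sketched as a brute-force nearest-neighbor search. This is an illustration only: it assumes cosine similarity as the distance measure, and `nearest_neighbors` and the toy index are hypothetical names, not our production vector search.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    query: List[float], index: Dict[str, List[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k most similar entries in the index, best match first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy index of already-embedded users (hypothetical handles and vectors).
index = {"@a": [1.0, 0.0], "@b": [0.0, 1.0], "@c": [0.9, 0.1]}
new_user = [1.0, 0.0]
matches = nearest_neighbors(new_user, index, k=2)
```

A linear scan like this is fine for roughly a thousand users; a larger index would typically call for an approximate nearest-neighbor structure instead.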

Turning ads into embeddings

Once the user graph was working, we embedded ads as well. Each ad is embedded based on its text, sentiment, emotional tone, and creative intent. We generate the ad embedding in the same space as the user embeddings, then compare it to the existing clusters. This shows which communities are likely to resonate with a message and why. It gives advertisers a clear view of who will care about an idea, grounded entirely in data.
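One way to picture the ad-to-cluster comparison is to score the ad vector against each cluster's centroid. The sketch below assumes centroids and cosine similarity, which may differ from how the comparison is actually implemented; `rank_clusters` and the cluster labels are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def centroid(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_clusters(
    ad_vec: List[float], clusters: Dict[str, List[List[float]]]
) -> List[Tuple[str, float]]:
    """Rank clusters by similarity between the ad and each cluster centroid."""
    scores = {
        label: cosine(ad_vec, centroid(members))
        for label, members in clusters.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Toy clusters of user vectors (labels and values are made up).
clusters = {
    "gamers": [[1.0, 0.1], [0.9, 0.0]],
    "foodies": [[0.0, 1.0], [0.1, 0.9]],
}
ad_vec = [0.8, 0.2]
ranking = rank_clusters(ad_vec, clusters)
```

The top-ranked label is the community whose members sit closest to the ad in the shared space.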

Challenges we faced

Our biggest challenge was designing an embedding format that integrates many different signals and tuning the GNN so that it actually captures community structure. Tight API rate limits and the need for fast vector search also required careful optimization of the pipeline.

What we learned

Building AdRag taught us how powerful embeddings become when they are combined with graph structure. We learned how user relationships, sentiment patterns, and posting behavior can be fused into a single representation that actually reflects social dynamics. We also learned a lot about large-scale data pipelines, model introspection, and the practical challenges of working with real API data.
