Personalized-feed-ranker

Inspiration

We live in an era of information overload. Every day, users are bombarded with thousands of potential posts, yet they only have the attention span for a few dozen. Traditional "reverse-chronological" feeds often bury high-quality content under a mountain of noise. We were inspired to build a system that acts as a relevance filter, ensuring that every time a user opens their app, they are met with content that truly resonates with their interests, thereby maximizing the value of their time.

What it does

The Personalized Feed Ranker is a machine learning pipeline that reorders content in real time. Instead of simply showing the newest posts first, it computes an Engagement Probability Score for every user-item pair and orders the feed by that score.

How we built it

We utilized a two-stage ranking architecture to ensure the system is both fast and accurate:

Candidate Retrieval: A lightweight filtering layer that narrows down millions of possible posts to the top 200 most relevant candidates.

Ranking (The Brain): A Gradient Boosted Decision Tree (LightGBM) model using a lambdarank objective. This model processes features like user-affinity scores, content embeddings, and temporal decay.

Tech Stack: Python for the ML logic, Pandas for feature engineering, and LightGBM for the ranking engine.
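To make the two-stage flow concrete, here is a minimal sketch in plain Python. It is not our production code: the `Post` class, the topic-overlap filter, the 24-hour half-life, and the hand-written scoring function are all assumptions chosen for illustration, and the scorer is only a stand-in for the trained LightGBM model.

```python
import math
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    topics: frozenset
    age_hours: float


def temporal_decay(age_hours, half_life_hours=24.0):
    """Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life."""
    return 0.5 ** (age_hours / half_life_hours)


def retrieve_candidates(posts, user_topics, k=200):
    """Stage 1: cheap filter. Keep posts sharing a topic with the user, newest first."""
    matching = [p for p in posts if p.topics & user_topics]
    matching.sort(key=lambda p: p.age_hours)
    return matching[:k]


def rank(candidates, user_affinity):
    """Stage 2 stand-in: strongest topic affinity multiplied by temporal decay."""
    def score(post):
        affinity = max((user_affinity.get(t, 0.0) for t in post.topics), default=0.0)
        return affinity * temporal_decay(post.age_hours)
    return sorted(candidates, key=score, reverse=True)


# Hypothetical data for illustration only.
posts = [
    Post("a", frozenset({"ml"}), 48.0),
    Post("b", frozenset({"ml", "python"}), 2.0),
    Post("c", frozenset({"cooking"}), 1.0),
]
affinity = {"ml": 0.9, "python": 0.4}
feed = rank(retrieve_candidates(posts, set(affinity), k=200), affinity)
print([p.post_id for p in feed])  # ['b', 'a']
```

Post "c" never reaches the ranker because the retrieval stage drops it, which is the point of the split: the expensive scorer only ever sees a small candidate set.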

Challenges we ran into

The Cold Start Problem: It was difficult to rank content for new users with no history. We solved this by implementing a "Popularity-Fallback" mechanism and using demographic-based "Global Priors."
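A rough sketch of how such a fallback could work is shown below. The linear blend, the `min_history` threshold of 20 interactions, and the function and parameter names are assumptions for illustration; the only parts taken from the description above are the idea of falling back to popularity and preferring a demographic prior when one exists.

```python
def prior_score(cohort_popularity, global_popularity):
    """Prefer the demographic cohort's popularity; fall back to global popularity."""
    return cohort_popularity if cohort_popularity is not None else global_popularity


def cold_start_score(personal_score, n_interactions,
                     cohort_popularity, global_popularity, min_history=20):
    """Blend the personalized score with a prior.

    With no history the result is the prior alone; the personalized score
    takes over linearly as the user accumulates interactions.
    """
    prior = prior_score(cohort_popularity, global_popularity)
    weight = min(n_interactions / min_history, 1.0)
    return weight * personal_score + (1.0 - weight) * prior


# Brand-new user: the score is the cohort prior.
print(cold_start_score(0.9, 0, 0.3, 0.1))
# Established user: the score is fully personalized.
print(cold_start_score(0.9, 50, 0.3, 0.1))
```

A hard switch at the threshold would also fit the description; the gradual blend simply avoids a sudden change in the feed when a user crosses it.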

Position Bias: Users are naturally more likely to click the first item they see regardless of quality. We had to de-bias our training data to ensure the model learned true preference, not just "top-of-page" clicks.
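One common way to de-bias click logs is inverse propensity weighting, sketched below. This is an illustrative assumption rather than a record of the exact method we used: the `1 / position` examination model, the `eta` exponent, and the weight cap are all hypothetical choices.

```python
def examination_propensity(position, eta=1.0):
    """Assumed probability that a user looks at a slot (positions are 1-based)."""
    return (1.0 / position) ** eta


def ipw_weights(positions, clicks, eta=1.0, max_weight=10.0):
    """Up-weight clicks that happened in rarely examined slots.

    A click at position 5 is stronger evidence of preference than a click at
    position 1, so it receives a larger training weight. Non-clicks keep
    weight 1.0, which is a simplification. Weights are capped so a single
    deep click cannot dominate training.
    """
    weights = []
    for position, clicked in zip(positions, clicks):
        if clicked:
            weight = 1.0 / examination_propensity(position, eta)
            weights.append(min(weight, max_weight))
        else:
            weights.append(1.0)
    return weights


# Hypothetical log rows: clicks at positions 1 and 3, no click at position 5.
print(ipw_weights([1, 3, 5], [1, 1, 0]))
```

Weights like these could then be supplied as per-sample weights when fitting the ranking model.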

Latency vs. Accuracy: Complex deep learning models were too slow for a live feed. We optimized our features to work with LightGBM, achieving sub-100ms inference times.

Accomplishments that we're proud of

65% Engagement Lift: In our simulated testing, the personalized ranker outperformed chronological sorting by 65% in total clicks.

What we learned

Features > Algorithms: We learned that the quality of the data (like how long a user hovered over a post) is often more important than the complexity of the machine learning model itself.

Implicit Feedback is Gold: Explicit "Likes" are rare. We learned to treat "Dwell Time" and "Shares" as stronger, more frequent signals of true user interest.
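As a sketch of how implicit signals might become training labels, the function below maps one impression to an integer relevance grade, which is the kind of label a ranking objective such as lambdarank typically consumes. The grade scale and the dwell-time thresholds (3 and 15 seconds) are hypothetical values for illustration, not numbers from our experiments.

```python
def relevance_label(dwell_seconds, shared, liked):
    """Map one impression to an integer relevance grade from 0 to 3.

    3: explicit endorsement (share or like)
    2: long dwell, the user clearly read the post
    1: short dwell, the user at least paused on it
    0: scrolled past
    """
    if shared or liked:
        return 3
    if dwell_seconds >= 15.0:
        return 2
    if dwell_seconds >= 3.0:
        return 1
    return 0


# Hypothetical impressions: (dwell_seconds, shared, liked)
impressions = [(0.5, False, False), (5.0, False, False),
               (30.0, False, False), (1.0, True, False)]
print([relevance_label(*row) for row in impressions])  # [0, 1, 2, 3]
```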

Ethics in Ranking: We realized the importance of "Exploration"—occasionally showing users something outside their bubble to prevent "Echo Chambers."
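A simple way to add exploration is an epsilon-greedy swap, sketched below under stated assumptions: the function name, the `epsilon` of 0.1, and the choice to replace a ranked item rather than insert an extra slot are all illustrative, not a description of a shipped feature.

```python
import random


def inject_exploration(ranked, exploration_pool, epsilon=0.1, rng=None):
    """With probability epsilon per slot, show an out-of-bubble item instead.

    `ranked` is the personalized ordering. `exploration_pool` holds items the
    ranker would not normally surface. The feed length is unchanged, so a
    swapped-out ranked item is dropped from this page.
    """
    rng = rng or random.Random()
    already_ranked = set(ranked)
    pool = [item for item in exploration_pool if item not in already_ranked]
    rng.shuffle(pool)

    feed = []
    for item in ranked:
        if pool and rng.random() < epsilon:
            feed.append(pool.pop())
        else:
            feed.append(item)
    return feed


ranked = ["p1", "p2", "p3", "p4"]
pool = ["x1", "x2", "x3", "x4", "p1"]
print(inject_exploration(ranked, pool, epsilon=0.25, rng=random.Random(7)))
```

Passing a seeded `random.Random` keeps the behavior reproducible in tests, while the default gives a different mix on each page load.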

What's next for Personalized-feed-ranker

Real-time Streaming: Moving from batch-processed features to a real-time stream (using Kafka/Flink) to update the feed the second a user clicks a post.

Multi-Task Learning: Training the model to optimize for multiple goals simultaneously (e.g., maximizing both "Likes" and "Retention").

Personalized ranking models like this one are the engines behind many of the most successful digital platforms today. While a 65% engagement lift sounds like a high bar, a gain of that size often comes from moving from a basic chronological list to a model that understands intent.

E-Commerce ("Recommended for You"): Retailers such as Amazon or Nike don't just show you random items; they rank products by your "Conversion Probability."

How it works: The model uses Collaborative Filtering features. If User A and User B both bought a yoga mat, and User A just bought a foam roller, the ranker will push the foam roller to the top of User B’s homepage.

The Engagement Lift: This usually results in a higher Click-Through Rate (CTR) and "Add to Cart" actions because the product feels curated rather than accidental.
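The yoga mat and foam roller example above can be sketched with simple co-purchase counts. This is a toy illustration of the collaborative filtering idea, not how any particular retailer implements it; the function names and the basket data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def co_purchase_counts(baskets):
    """baskets maps user -> set of items. Returns item -> Counter of co-bought items."""
    counts = defaultdict(Counter)
    for items in baskets.values():
        for first, second in permutations(items, 2):
            counts[first][second] += 1
    return counts


def recommend(user, baskets, top_n=3):
    """Suggest items most often bought alongside what the user already owns."""
    counts = co_purchase_counts(baskets)
    owned = baskets[user]
    scores = Counter()
    for item in owned:
        for other, count in counts[item].items():
            if other not in owned:
                scores[other] += count
    return [item for item, _ in scores.most_common(top_n)]


baskets = {
    "A": {"yoga mat", "foam roller"},
    "B": {"yoga mat"},
    "C": {"running shoes"},
}
print(recommend("B", baskets))  # ['foam roller']
```

In a full ranker, a count like this would be one input feature among many rather than the final ordering.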

Built With

html
python

Updates

frida p started this project — Dec 21, 2025 06:08 AM EST
