Inspiration
We are a group of sophomores from the Singapore University of Technology and Design studying Engineering Systems and Design, and our interests lie in using data to build innovative tech solutions. Our team is thrilled to be joining the TikTok Hackathon Challenge 2023, where we focused on optimizing the social media advertisement moderation process. In a digital era where content safety is paramount, we were eager to create a dynamic stochastic optimization model that intelligently scores and prioritizes advertisements for review, ensuring efficient content-moderator matching. With a shared passion for enhancing online safety and user experience, we're committed to applying our expertise to make a positive impact on TikTok's advertisement moderation process.
Our team: TikTok on the Clock
Assigned Team Number: (1)
- Emily Chee
- Emily is interested in delving into and enhancing the digital world. With a focus on data analytics, she is learning to master the art of managing vast datasets and enhancing data accuracy. Her experience as an avid social media user gives her a unique insight into user experiences and trends. And yes, she's a proud coffee addict.
- Emily's LinkedIn
- Jeanelle Boey
- Jeanelle is passionate about harnessing the power of big data to unearth innovative solutions within the tech industry, with a specific focus on advancing machine learning models. During her leisure hours, she indulges in the art of baking and in crafting the perfect matcha latte.
- Jeanelle's LinkedIn
- Joash Tan
- Joash is fascinated by how digitalisation can revolutionize the future of our world. With experience in web/app development, he is equipped with skills in front-end development and UI/UX, but his passion still lies in data analytics. In his free time, he enjoys writing music-related blogs and doubles as a part-time physics tutor.
- Joash's LinkedIn
- Justin Wong
- Justin's interests lie in tech and entrepreneurship, and he also enjoys fishing during his free time.
- Justin's LinkedIn
- Raynard Chai
- Raynard is driven by a passion for empowering businesses to scale sustainably while making a positive social impact. Being fluent in business analytics, operations research, and social science, he brings a diverse skill set to the table. Outside of his professional pursuits, Raynard enjoys regular runs and indulges in photography as a creative outlet.
- Raynard's LinkedIn
What it does
Our solution is a data science pipeline that ingests two datasets, one of ads and one of moderators, and returns a dataset of moderators with advertisements assigned to them.
Methodology
Our Approach:
Ad Scoring:
We propose a weighted system to score the advertisements, using the scaled features 'punish_num' and 'avg_ad_revenue'. The formula we employed is:
ad score = w_1 x punish num + w_2 x avg ad revenue
Ads are then ranked from highest to lowest score, indicating their priority for review.
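A minimal sketch of this scoring step, assuming min-max scaling and equal weights. The scaling method, the weight values, and the `ad_id` field are illustrative assumptions; only `punish_num` and `avg_ad_revenue` come from the description above.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_ads(ads, w1=0.5, w2=0.5):
    """Return ads sorted by descending weighted score.

    w1 and w2 are placeholder weights, not the values used in the project.
    """
    punish = min_max_scale([a["punish_num"] for a in ads])
    revenue = min_max_scale([a["avg_ad_revenue"] for a in ads])
    scored = [
        {**a, "ad_score": w1 * p + w2 * r}
        for a, p, r in zip(ads, punish, revenue)
    ]
    return sorted(scored, key=lambda a: a["ad_score"], reverse=True)


# Toy data; "ad_id" is a hypothetical identifier column.
ads = [
    {"ad_id": "A", "punish_num": 0, "avg_ad_revenue": 10.0},
    {"ad_id": "B", "punish_num": 4, "avg_ad_revenue": 30.0},
    {"ad_id": "C", "punish_num": 1, "avg_ad_revenue": 50.0},
]
ranked_ads = score_ads(ads)
print([a["ad_id"] for a in ranked_ads])
```

Scaling both features first keeps a large revenue figure from swamping a small punishment count, so the weights express relative importance rather than units.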
Moderator Scoring:
Recognizing the importance of having adept moderators, we've also devised a scoring system for them. The scoring is influenced by the moderator's average handling time (inverse relation) and their accuracy. The mathematical representation is:
moderator score = v_1 x (1/handling time) + v_2 x accuracy
Following this, moderators are ranked from highest to lowest score, indicating their proficiency.
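The moderator formula can be sketched in the same way. The weights and the field names `moderator`, `handling_time`, and `accuracy` are assumptions for illustration. The sketch applies the formula to raw values; scaling both terms first, as is done for the ad features, would be a natural extension.

```python
def moderator_score(handling_time, accuracy, v1=0.5, v2=0.5):
    """Weighted sum of inverse handling time and accuracy.

    v1 and v2 are placeholder weights, not the values used in the project.
    """
    if handling_time <= 0:
        raise ValueError("handling_time must be positive")
    return v1 * (1.0 / handling_time) + v2 * accuracy


# Toy data with hypothetical field names.
moderators = [
    {"moderator": "m1", "handling_time": 2.0, "accuracy": 0.90},
    {"moderator": "m2", "handling_time": 4.0, "accuracy": 0.95},
    {"moderator": "m3", "handling_time": 1.0, "accuracy": 0.70},
]
ranked_moderators = sorted(
    moderators,
    key=lambda m: moderator_score(m["handling_time"], m["accuracy"]),
    reverse=True,
)
print([m["moderator"] for m in ranked_moderators])
```

With equal weights the fastest moderator ranks first here despite the lowest accuracy, which shows how strongly the choice of weights shapes the ranking.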
Matching Algorithm:
Subsequently, we deployed a matching algorithm to pair high-priority ads with top-performing moderators, taking into account the ads' delivery countries and the moderators' market expertise. This seeks to achieve a synergetic match where both the ad's significance and the moderator's competence are optimized.
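One simple way to realise such a pairing is a greedy pass over the ranked ads. This is a sketch under stated assumptions rather than the exact procedure used: the `capacity` limit per moderator and the field names `delivery_country` and `markets` are hypothetical.

```python
def match_ads_to_moderators(ranked_ads, ranked_moderators, capacity=1):
    """Greedily assign each ad to the best-ranked eligible moderator.

    A moderator is eligible if the ad's delivery country is one of their
    markets and they hold fewer than `capacity` ads (a hypothetical limit).
    Both inputs are expected to be sorted from highest to lowest score.
    """
    assignments = {m["moderator"]: [] for m in ranked_moderators}
    unassigned = []
    for ad in ranked_ads:
        for mod in ranked_moderators:
            queue = assignments[mod["moderator"]]
            if ad["delivery_country"] in mod["markets"] and len(queue) < capacity:
                queue.append(ad["ad_id"])
                break
        else:
            # No eligible moderator had room for this ad.
            unassigned.append(ad["ad_id"])
    return assignments, unassigned


# Toy inputs, already ranked; all field names are illustrative.
ranked_ads = [
    {"ad_id": "B", "delivery_country": "SG"},
    {"ad_id": "C", "delivery_country": "US"},
    {"ad_id": "A", "delivery_country": "SG"},
    {"ad_id": "D", "delivery_country": "FR"},
]
ranked_moderators = [
    {"moderator": "m3", "markets": {"SG"}},
    {"moderator": "m1", "markets": {"US", "SG"}},
]
assignments, unassigned = match_ads_to_moderators(ranked_ads, ranked_moderators)
print(assignments, unassigned)
```

Ads that no moderator can take are returned separately, so they can be queued for a later round instead of being silently dropped.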
Optimization Using Genetic Algorithm:
To realize the twin objectives of revenue maximization and risk minimization, we've framed the following objective functions:
Estimated revenue rate = accuracy x avg ad revenue/handling time
Estimated riskiness = (1 - accuracy) x punish num
We optimize these two objectives jointly with a genetic algorithm, maximizing the estimated revenue rate while minimizing the estimated riskiness.
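The two objective functions translate directly into code. In the sketch below, summing the per-pair values over an assignment is an assumption about how a candidate solution is evaluated, and the dictionary field names are illustrative.

```python
def estimated_revenue_rate(accuracy, avg_ad_revenue, handling_time):
    """Revenue expected per unit of handling time for one ad-moderator pair."""
    return accuracy * avg_ad_revenue / handling_time


def estimated_riskiness(accuracy, punish_num):
    """Risk that a previously punished ad slips past an imperfect moderator."""
    return (1.0 - accuracy) * punish_num


def evaluate_assignment(pairs):
    """Return (total revenue rate, total riskiness) for (ad, moderator) pairs.

    Summing over pairs is an assumed aggregation, chosen for illustration.
    """
    revenue = sum(
        estimated_revenue_rate(m["accuracy"], a["avg_ad_revenue"], m["handling_time"])
        for a, m in pairs
    )
    risk = sum(estimated_riskiness(m["accuracy"], a["punish_num"]) for a, m in pairs)
    return revenue, risk


ad = {"punish_num": 4, "avg_ad_revenue": 30.0}
moderator = {"accuracy": 0.9, "handling_time": 2.0}
total_revenue, total_risk = evaluate_assignment([(ad, moderator)])
print(total_revenue, total_risk)
```

A genetic algorithm would call a function like `evaluate_assignment` on each candidate assignment, keeping those that trade revenue against risk most favourably.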
How we built it
We first conducted exploratory data analysis to understand the advertisement moderation process, then used that understanding to formulate the process mathematically as a multi-objective optimization problem. We aimed to maximise the estimated revenue rate while minimising the estimated riskiness. To achieve this, we preprocessed the data (which came as a raw data dump), created a scoring function and a matching algorithm, and used the Non-dominated Sorting Genetic Algorithm II (NSGA-II) to solve the optimization problem. We then used compromise programming, a multi-criteria decision-making technique, to select a solution from the Pareto front generated by NSGA-II. That solution was then used to produce the final dataset of moderators with advertisements assigned to them.
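The selection step can be illustrated without any optimization library. The sketch below filters a handful of made-up (revenue rate, riskiness) candidates down to the non-dominated ones, then picks the point closest to the ideal under a weighted L_p distance, which is one common form of compromise programming. The weights, the value of `p`, and the candidate values are assumptions, and this plain-Python filter stands in for the front that NSGA-II would produce.

```python
def pareto_front(points):
    """Keep points not dominated by another (maximise revenue, minimise risk)."""
    front = []
    for i, (rev_i, risk_i) in enumerate(points):
        dominated = any(
            rev_j >= rev_i and risk_j <= risk_i and (rev_j > rev_i or risk_j < risk_i)
            for j, (rev_j, risk_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((rev_i, risk_i))
    return front


def compromise_choice(front, weights=(0.5, 0.5), p=2):
    """Pick the front point with the smallest weighted L_p distance to the ideal."""
    revs = [rev for rev, _ in front]
    risks = [risk for _, risk in front]
    rev_best, rev_worst = max(revs), min(revs)
    risk_best, risk_worst = min(risks), max(risks)

    def distance(point):
        rev, risk = point
        # Normalised shortfall from the ideal value of each objective, in [0, 1].
        d_rev = (rev_best - rev) / (rev_best - rev_worst) if rev_best != rev_worst else 0.0
        d_risk = (risk - risk_best) / (risk_worst - risk_best) if risk_worst != risk_best else 0.0
        return (weights[0] * d_rev ** p + weights[1] * d_risk ** p) ** (1.0 / p)

    return min(front, key=distance)


# Made-up (revenue rate, riskiness) values for five candidate assignments.
candidates = [(10.0, 1.0), (14.0, 3.0), (12.0, 1.5), (9.0, 2.0), (13.0, 4.0)]
front = pareto_front(candidates)
chosen = compromise_choice(front)
print(front, chosen)
```

Normalising each objective before measuring distance keeps the choice from being driven by whichever objective happens to have the larger numeric range.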
Challenges we ran into
While our team has experience with optimization and machine learning, tackling TikTok ad rankings turned out to be trickier than expected. It's not merely a matter of plotting Y against X and applying a predictive model. Our journey into this challenge unveiled a complex terrain with various nuances.
Initially, we delved into the realm of queueing processes, drawing inspiration from models like the Jackson Network. Our brainstorming sessions generated a multitude of innovative ideas. However, we hit a roadblock. We realised there isn't a one-size-fits-all solution. The nature of TikTok, with its diverse content and ever-changing trends, made the problem more complex.
Accomplishments that we're proud of
We assessed our options based on what we could realistically achieve in the given time frame. We understood that in a dynamic space like TikTok, what's considered the 'best' solution can change rapidly.
Ultimately, we chose a solution that made sense to us based on the problem's complexity, our resources, imperfect information at hand, and the hackathon's time constraints. While it might not be a perfect solution, it's a step forward in improving the moderation process at TikTok.
What we learned
We deepened our understanding of optimization techniques, particularly through the use of pymoo and genetic algorithms. This experience allowed us to appreciate the power and versatility of these tools in solving real-world problems like this TikTok ad ranking scenario.
We also learnt the significance of mathematical modeling in tackling complex challenges. As we delved into data exploration, we realized that our understanding of the advertisement moderation process could be translated into effective mathematical models.
What's next for TikTok on the Clock
Given more time, we could have also focused on fine-tuning the parameters of the model to improve both the speed and quality of our solutions. Because our solution was built on certain assumptions, our team ultimately hopes to work closely with the data science team at TikTok, so that we can model the process even better and improve our optimization model.