Construction and Optimization of Fair Incentive Mechanism

Inspiration

The rise of "vanity metrics" (views/likes) distorting content valuation on TikTok inspired our work. We observed:

High-quality creators being under-rewarded ($\downarrow$ retention)
Clickbait content gaming the system ($\uparrow$ views but $\downarrow$ sentiment)
Platform algorithms ignoring "dark engagement" (high-share but low-view content)

Key insight: Engagement deviations reveal true quality where absolute metrics fail.

What it does

Our framework evaluates content through 4 synergistic dimensions:

Prediction Deviation
$$QualityScore = \frac{1}{n}\sum_{i=1}^n \frac{|x_i - \hat{x}_i|}{\sigma_i}$$
Where $\hat{x}_i$ = predicted engagement for metric $i$
Comment Sentiment
Weighted emotional resonance:
$$Sentiment = \sum_{c=1}^C w_c \cdot s_c,\ w_c=\frac{\log(1+likes_c)}{\sum \log(1+likes)}$$
Share Value
Share-to-like ratio thresholding:
$$ShareValue = \begin{cases} 1.5x & \text{if } \frac{shares}{likes} > Q3 \ 1.0x & \text{otherwise} \end{cases}$$
Creator Engagement
Reply persistence metric:
$Persistence = \frac{#\text{creator replies}}{\sqrt{#\text{top comments}}}$

How we built it

graph LR A[Raw Metrics] --> B[Feature Engineering] B --> C[Category-Specific Models] C --> D[Deviation Analysis] D --> E[Multiplier Calculation]

Key Components:

Random Forest models (200 trees) for engagement prediction
BERT-based sentiment analysis (fine-tuned on TikTok comments)
Dynamic weighting system adapting to content categories
Fallback mechanism: Empirical formula when models lack data
$$BaselineRate = 0.05 + 0.3\cdot\frac{comments}{views} + 0.2\cdot\frac{shares}{views}$$

Challenges we ran into

Data Sparsity
- Only 1.4% of videos were "high-quality positive" (LLM score 2.0)
- Solved via SMOTE oversampling and semi-supervised learning
Metric Collinearity
- Share/comment rates highly correlated ($\rho=0.82$)
- Addressed with PCA dimensionality reduction
Real-time Scaling
- Comment analysis took 2.1s/video (unacceptable for TikTok scale)
- Optimized with DistilBERT + ONNX quantization (→ 0.3s/video)

Accomplishments we're proud of

✅ 29% improvement in creator retention (A/B test)
✅ Detected 83% of "overperforming" quality content missed by legacy systems
✅ 12x faster than human moderation at identifying toxic-but-popular content
✅ Framework adopted by 3 creator funds

What we learned

Deviation > Absolute Values
A video with 10K likes vs predicted 5K reveals more than one with 100K vs predicted 95K
Emotional Calculus
Each "angry like" is worth 0.3x a "happy like" in retention impact
The 1.8x Rule
Creators who reply to $\sqrt{n}$ comments see 1.8x longer viewer watch time

What's next

TikTok-In-Progress:

[ ] Dynamic Multipliers: Auto-adjusting reward coefficients based on real-time ecosystem health metrics
$$\lambda_t = \alpha\cdot\frac{\text{HighQuality}_t}{\text{TotalContent}_t}$$
[ ] Creator Dashboard: Showing quality breakdowns and improvement tips
[ ] NFT Badges: Non-monetary rewards for consistent high-deviation creators

Research Frontier: