Offside Market: Finding Where the Crowd Gets It Wrong
Inspiration
Prediction markets are supposed to be the smartest aggregator of human belief: real money, real stakes, real signal. But soccer has always been a sport where regional bias, recency bias, and narrative override data. France gets hyped after a strong qualifying campaign. A historic powerhouse carries odds that don't reflect a depleted squad. The crowd isn't wrong often, but when it is, the gap is measurable.
With the 2026 FIFA World Cup approaching fast, I wanted to answer one question: can a performance-based model systematically find where Polymarket is mispricing teams?
How I Built It
1. Data Pipeline
Historical match data came from the martj42 international results dataset: 732 competitive international fixtures from 2019 to 2024 across World Cup qualifying, Copa America, Euros, AFCON, Gold Cup, and Nations League. Live market odds were pulled from the Polymarket CLOB API for the tournament-winner contracts of all 48 qualified teams.
Note: FBref xG data required a browser-based scraper unavailable in the Zerve sandbox. Goals-based performance weighting was used as a proxy, which correlates strongly with xG in competitive international fixtures.
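To make the filtering step concrete, here is a minimal sketch of how competitive fixtures in the 2019 to 2024 window could be selected from a results CSV. The column names (date, home_team, away_team, home_score, away_score, tournament) and the tournament keywords are assumptions about the dataset layout, and the sample rows use placeholder teams rather than real results; this is not the project's actual pipeline.

```python
import csv
import io

# Assumed substrings of competitive tournament labels; verify against the real CSV.
COMPETITIVE_KEYWORDS = ("World Cup", "Copa Am", "Euro", "Nations", "Gold Cup")


def load_competitive(csv_text, start_year=2019, end_year=2024):
    """Return competitive matches inside the year window, oldest first."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["date"][:4])  # dates assumed to be ISO formatted (YYYY-MM-DD)
        if not (start_year <= year <= end_year):
            continue
        if not any(key in row["tournament"] for key in COMPETITIVE_KEYWORDS):
            continue  # drops friendlies and anything else not matched
        rows.append({
            "date": row["date"],
            "home": row["home_team"],
            "away": row["away_team"],
            "home_goals": int(row["home_score"]),
            "away_goals": int(row["away_score"]),
            "tournament": row["tournament"],
        })
    rows.sort(key=lambda r: r["date"])  # Elo updates need chronological order
    return rows


# Placeholder rows for illustration only (not real results).
SAMPLE = """date,home_team,away_team,home_score,away_score,tournament
2018-06-01,Team A,Team B,2,1,FIFA World Cup
2021-07-01,Team C,Team D,1,1,UEFA Euro
2022-03-01,Team A,Team D,3,0,Friendly
"""
matches = load_competitive(SAMPLE)
```

Only the second sample row survives: the first falls outside the date window and the third is a friendly.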
2. Team Strength Model
I built a performance-adjusted Elo rating system. Unlike traditional Elo, which updates on wins and losses alone, this system weights each update by the quality of the performance:
$$\Delta R = K \cdot (S - E) \cdot \frac{G_{\text{for}}}{G_{\text{for}} + G_{\text{against}}}$$
Where $S$ is match outcome (1/0.5/0), $E$ is expected score from current ratings, and $K$ is scaled by competition weight (World Cup = 1.5x, friendlies = 0.3x).
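A minimal sketch of the update rule above, for one team's rating change after a single match. The base K value of 20, the default weight of 1.0 for unlisted competitions, and the 0.5 fallback for goalless matches (where the goal-share term is undefined) are assumptions for illustration; only the 1.5x and 0.3x weights come from the text. How the opponent's rating is adjusted is not specified in the writeup, so the sketch leaves it out.

```python
# Competition weights stated in the text; other competitions default to 1.0 (assumption).
COMPETITION_WEIGHT = {"world_cup": 1.5, "friendly": 0.3}
BASE_K = 20.0  # assumed base K-factor, not given in the writeup


def expected_score(r_a, r_b):
    """Standard Elo expected score for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def goal_share(goals_for, goals_against):
    """G_for / (G_for + G_against), with an assumed 0.5 fallback for 0-0."""
    total = goals_for + goals_against
    if total == 0:
        return 0.5  # assumption: the formula is undefined when no goals are scored
    return goals_for / total


def rating_delta(r_a, r_b, goals_for, goals_against, competition, base_k=BASE_K):
    """Rating change for team A: K * (S - E) * goal share."""
    if goals_for > goals_against:
        outcome = 1.0
    elif goals_for == goals_against:
        outcome = 0.5
    else:
        outcome = 0.0
    k = base_k * COMPETITION_WEIGHT.get(competition, 1.0)
    return k * (outcome - expected_score(r_a, r_b)) * goal_share(goals_for, goals_against)


# Two equally rated teams, 2-0 win at a World Cup: K = 30, S - E = 0.5, goal share = 1.
delta = rating_delta(1500.0, 1500.0, 2, 0, "world_cup")
```

One consequence of the formula as written: a team that loses without scoring has a goal share of zero, so its rating does not move under this rule.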
3. Monte Carlo Tournament Simulation
Using the actual 2026 bracket draw, I ran 10,000 full tournament simulations. Each match probability is derived from the two teams' Elo ratings via the standard logistic function:
$$P(\text{win}) = \frac{1}{1 + 10^{(R_B - R_A)/400}}$$
The output is a win-probability distribution for every team, independent of market sentiment.
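The simulation loop can be sketched as follows. This is a deliberately simplified version: it plays a pure single-elimination bracket with a power-of-two field and ignores the group stage, so it illustrates the technique rather than the actual 2026 format. The team names, ratings, bracket order, and seed are all hypothetical.

```python
import random
from collections import Counter


def win_probability(r_a, r_b):
    """Logistic Elo probability that team A beats team B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def simulate_knockout(bracket, ratings, rng):
    """Play one single-elimination bracket; adjacent entries meet each round."""
    teams = list(bracket)
    while len(teams) > 1:
        next_round = []
        for i in range(0, len(teams), 2):
            a, b = teams[i], teams[i + 1]
            a_wins = rng.random() < win_probability(ratings[a], ratings[b])
            next_round.append(a if a_wins else b)
        teams = next_round
    return teams[0]


def title_probabilities(bracket, ratings, n_sims=10_000, seed=0):
    """Estimate each team's title probability as its share of simulated wins."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    wins = Counter(simulate_knockout(bracket, ratings, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in bracket}


# Hypothetical four-team bracket: A plays D, B plays C, winners meet in the final.
ratings = {"A": 1700.0, "B": 1600.0, "C": 1500.0, "D": 1400.0}
probs = title_probabilities(["A", "D", "B", "C"], ratings)
```

Extending this to the real format would mean adding a group-stage step (with draws and tiebreakers) before the knockout rounds, which the sketch does not attempt.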
4. Market Delta Detection
The core insight layer: compare model probabilities against live Polymarket prices to surface the largest discrepancies. A team trading at 8% on Polymarket that my model rates at 18% is offside.
Top findings:
- France: market 16.3% vs model 7.8% — overvalued by 8.5 points
- England: market 11.0% vs model 3.5% — overvalued by 7.5 points
- Japan: market 2.1% vs model 6.3% — undervalued by 4.1 points
- South Korea: market 0.3% vs model 3.5% — undervalued by 3.2 points
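The delta calculation itself is simple. The sketch below recomputes it for a few of the published pairs, with probabilities expressed in percentage points; the function and variable names are illustrative, not taken from the project code.

```python
def market_deltas(model, market):
    """Return (team, market %, model %, delta, label) sorted by largest gap.

    delta = model - market, so a negative value means the market prices the
    team above the model (overvalued) and a positive value means undervalued.
    """
    rows = []
    for team in model.keys() & market.keys():  # only teams present in both sources
        delta = round(model[team] - market[team], 1)
        label = "overvalued" if delta < 0 else "undervalued"
        rows.append((team, market[team], model[team], delta, label))
    rows.sort(key=lambda r: abs(r[3]), reverse=True)
    return rows


# Figures taken from the findings above (percent).
market = {"France": 16.3, "England": 11.0, "South Korea": 0.3}
model = {"France": 7.8, "England": 3.5, "South Korea": 3.5}

for team, mkt, mdl, delta, label in market_deltas(model, market):
    print(f"{team}: market {mkt}% vs model {mdl}% ({label} by {abs(delta)} points)")
```

Sorting by absolute gap puts the largest disagreements first regardless of direction, which matches how the findings are ranked.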
Challenges
FBref access in a sandboxed environment. The soccerdata library requires a Chrome browser to scrape FBref, which is unavailable in Zerve's Lambda environment. I pivoted to the martj42 international results dataset, which covers the same fixtures without a browser dependency.
48 teams is a new format. There is no historical calibration data for a 48-team World Cup bracket. The group stage structure is genuinely novel, which means the simulation carries more uncertainty than ideal, and that uncertainty is part of what makes it interesting.
Market liquidity varies wildly. A top-8 favorite has deep, efficient pricing. A group-stage dark horse might have thin order books. The model surfaces volume alongside each delta so readers can weight findings accordingly.
What I Learned
Zerve fundamentally changed how fast I could iterate. What would normally be a week of disconnected Jupyter notebooks, API wrappers, and deployment headaches collapsed into a single focused workflow. The AI handled the boilerplate; I stayed in the analysis layer the whole time.
The biggest technical lesson: market prices and model probabilities are both wrong, just differently. The interesting projects live in the gap between them.