Inspiration
Global sports markets are often treated as “efficient,” but in reality they are heavily influenced by human bias, incomplete data, and emotional narratives.
We noticed patterns:
- Popular teams are often overvalued
- Underdogs (especially from less-covered regions) are systematically underestimated
- Early World Cup odds are set with limited information and adjust slowly
This raised a key question:
What if we could combine historical data, machine learning, and live market odds to systematically detect where the market is wrong?
WorldCup-Oracle was built to answer that.
What it does
WorldCup-Oracle is an AI-powered prediction engine that identifies mispriced teams in World Cup betting markets.
It:
- Predicts team performance probabilities using machine learning
- Ingests real-time odds from prediction markets
- Compares model predictions with market-implied probabilities
- Highlights undervalued and overvalued teams
Users can:
- Explore predictions through an interactive dashboard
- Compare teams in-depth
- Identify value opportunities using a ranked “edge” system
How we built it
We built a full-stack system with four main layers:
1. Data Layer
- Historical World Cup data (2002–2022) from FBref
- Team metrics: xG, defense, possession, squad strength
- Live odds from Polymarket (GraphQL) and Kalshi (REST)
2. Modeling Layer
- Gradient Boosting ensemble (XGBoost/LightGBM)
- Leave-one-tournament-out cross-validation
Outputs probabilities for:
- Winning
- Reaching semifinals
- Group stage exit
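As an illustration of leave-one-tournament-out validation, here is a minimal sketch in plain Python. Each fold holds out every row from one tournament and trains on the rest, so the model is never scored on a World Cup it has already seen. The row layout, the `year` key, and the toy values are assumptions made for this example, not the project's actual schema.

```python
from collections import defaultdict


def leave_one_tournament_out(rows, key="year"):
    """Yield (held_out, train_rows, test_rows), holding out one tournament per fold."""
    by_tournament = defaultdict(list)
    for row in rows:
        by_tournament[row[key]].append(row)
    for held_out in sorted(by_tournament):
        # Train on every tournament except the held-out one.
        train = [
            r
            for tournament, group in by_tournament.items()
            if tournament != held_out
            for r in group
        ]
        test = by_tournament[held_out]
        yield held_out, train, test


# Hypothetical toy rows; real features would include xG, possession, and so on.
rows = [
    {"year": 2014, "team": "A", "xg": 1.8, "won": 1},
    {"year": 2014, "team": "B", "xg": 1.1, "won": 0},
    {"year": 2018, "team": "A", "xg": 1.5, "won": 0},
    {"year": 2018, "team": "C", "xg": 1.9, "won": 1},
    {"year": 2022, "team": "B", "xg": 1.4, "won": 0},
    {"year": 2022, "team": "C", "xg": 2.0, "won": 1},
]

folds = list(leave_one_tournament_out(rows))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

With scikit-learn available, `sklearn.model_selection.LeaveOneGroupOut` produces the same kind of split when the tournament year is passed as the group label.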
3. Market Analysis Layer
- Converts odds into implied probabilities
- Removes market margin (vig normalization)
- Computes edge = model probability – market-implied probability
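The three steps of this layer can be sketched as follows. This is a minimal example that assumes decimal odds as input and uses proportional (multiplicative) normalization to remove the margin; other de-vigging methods exist, and the writeup does not say which one the project uses. Team names and numbers are hypothetical.

```python
def implied_from_decimal_odds(decimal_odds):
    """Convert decimal odds to raw implied probabilities (1 / odds)."""
    return {team: 1.0 / odds for team, odds in decimal_odds.items()}


def remove_vig(raw_probs):
    """Rescale raw implied probabilities so they sum to 1 (proportional method)."""
    total = sum(raw_probs.values())
    if total <= 0:
        raise ValueError("raw probabilities must sum to a positive number")
    return {team: p / total for team, p in raw_probs.items()}


def compute_edges(model_probs, market_probs):
    """edge = model probability - market probability, for teams present in both."""
    return {
        team: model_probs[team] - market_probs[team]
        for team in model_probs
        if team in market_probs
    }


# Hypothetical inputs for illustration only.
decimal_odds = {"Team A": 3.0, "Team B": 2.0, "Team C": 4.0}
model = {"Team A": 0.35, "Team B": 0.40, "Team C": 0.25}

raw = implied_from_decimal_odds(decimal_odds)  # sums to more than 1 (the margin)
market = remove_vig(raw)                       # sums to 1 after normalization
edges = compute_edges(model, market)

# Rank from most undervalued (largest positive edge) to most overvalued.
ranked = sorted(edges.items(), key=lambda kv: kv[1], reverse=True)
for team, edge in ranked:
    print(f"{team}: {edge:+.3f}")
```

A positive edge means the model rates the team higher than the market does; a negative edge means the market rates it higher.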
4. Interface & API
- Streamlit dashboard (interactive charts, comparisons)
FastAPI backend with endpoints:
/predictions/mispricing/backtest
Challenges we ran into
Data inconsistency: Historical World Cup data across years required normalization and cleaning to maintain feature consistency.
Market normalization: Removing bookmaker margins and aligning different market formats (Polymarket vs Kalshi) was non-trivial.
Overfitting risk: With only a handful of World Cups to learn from, making sure the model generalized required careful validation (leave-one-tournament-out cross-validation).
Real-time integration: Handling live odds reliably while maintaining performance required caching and fallback strategies.
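One way to combine caching with a fallback is a small time-to-live cache that serves the last good snapshot when the upstream call fails. The sketch below is an assumption about the general pattern, not the project's actual code; `CachedOddsFetcher`, the `fetch` callable, and the 30-second TTL are hypothetical names and values.

```python
import time


class CachedOddsFetcher:
    """Serve cached odds within a TTL; fall back to stale data if a refresh fails."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch          # callable that returns the latest odds
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return (odds, source), where source is 'live', 'cache', or 'stale'."""
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return self._value, "cache"
        try:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value, "live"
        except Exception:
            if self._fetched_at is None:
                raise  # no earlier snapshot to fall back to
            return self._value, "stale"


# Simulated upstream: succeeds once, then fails.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] > 1:
        raise ConnectionError("upstream unavailable")
    return {"Team A": 0.31}


fake_now = [0.0]
fetcher = CachedOddsFetcher(flaky_fetch, ttl_seconds=30.0, clock=lambda: fake_now[0])

first = fetcher.get()    # first call hits the upstream
fake_now[0] = 10.0
second = fetcher.get()   # within the TTL, served from cache
fake_now[0] = 60.0
third = fetcher.get()    # TTL expired and upstream fails, so stale data is returned
print(first[1], second[1], third[1])
```

Returning the source label alongside the odds lets a dashboard flag stale prices instead of silently presenting them as live.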
🏆 Accomplishments that we're proud of
- Achieved an 83% improvement over baseline market predictions in backtesting
- Built a complete pipeline from data → model → market → insights → UI
- Identified real, quantifiable mispricings (e.g., Argentina +18% edge)
Delivered both:
- A user-friendly dashboard
- A developer-ready API
Most importantly:
We showed that combining AI with market data creates a measurable advantage.
🧠 What we learned
- Markets are strong, but predictably biased in certain scenarios
- Blending model predictions with market data is more powerful than either alone
- Validation strategy matters more than model complexity
- Good visualization is critical for turning data into decisions
🚀 What's next for WorldCup-Oracle
Expand beyond the World Cup into:
- Club football (Champions League, leagues)
- Other sports (NBA, NFL, etc.)
Improve the model with:
- Player-level data
- Injury tracking
- Real-time form updates
Add:
- Automated alerts for new mispricings
- Mobile-friendly interface
- User personalization
Explore integration with:
- Live betting tools
- Portfolio tracking systems
WorldCup-Oracle is more than a prediction tool: it is a system designed to find truth where markets are imperfect.
Built With
- apis
- python
- streamlit
- zerve