Inspiration

Global sports markets are often treated as “efficient,” but in reality they are heavily influenced by human bias, incomplete data, and emotional narratives.

We noticed patterns:

  • Popular teams are often overvalued
  • Underdogs (especially from less-covered regions) are systematically underestimated
  • Early World Cup odds are set with limited information and adjust slowly

This raised a key question:

What if we could combine historical data, machine learning, and live market odds to systematically detect where the market is wrong?

WorldCup-Oracle was built to answer that.

What it does

WorldCup-Oracle is an AI-powered prediction engine that identifies mispriced teams in World Cup betting markets.

It:

  • Predicts team performance probabilities using machine learning
  • Ingests real-time odds from prediction markets
  • Compares model predictions with market-implied probabilities
  • Highlights undervalued and overvalued teams

Users can:

  • Explore predictions through an interactive dashboard
  • Compare teams in-depth
  • Identify value opportunities using a ranked “edge” system

How we built it

We built a full-stack system with four main layers:

1. Data Layer

  • Historical World Cup data (2002–2022) from FBref
  • Team metrics: xG, defense, possession, squad strength
  • Live odds from Polymarket (GraphQL) and Kalshi (REST)

2. Modeling Layer

  • Gradient Boosting ensemble (XGBoost/LightGBM)
  • Leave-one-tournament-out cross-validation
  • Outputs probabilities for:

    • Winning
    • Reaching semifinals
    • Group stage exit
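
Leave-one-tournament-out validation can be sketched without any ML library: each World Cup is held out once while the model trains on all the others. The snippet below is only an illustration of the splitting step. The `rows` schema and the `leave_one_tournament_out` helper are hypothetical names, not the project's actual code, and the toy rows carry no real data.

```python
from collections import defaultdict


def leave_one_tournament_out(rows):
    """Yield (held_out_year, train_indices, test_indices) per tournament.

    `rows` is assumed to be a list of dicts with a "year" key
    (hypothetical schema, one row per team per tournament).
    """
    by_year = defaultdict(list)
    for i, row in enumerate(rows):
        by_year[row["year"]].append(i)

    for year in sorted(by_year):
        test_idx = by_year[year]
        # Train on every row that does not belong to the held-out tournament.
        train_idx = sorted(
            i for other, idxs in by_year.items() if other != year for i in idxs
        )
        yield year, train_idx, test_idx


# Toy rows for illustration only.
rows = [
    {"year": 2014, "team": "A"},
    {"year": 2014, "team": "B"},
    {"year": 2018, "team": "A"},
    {"year": 2022, "team": "A"},
    {"year": 2022, "team": "C"},
]

folds = list(leave_one_tournament_out(rows))
for year, train_idx, test_idx in folds:
    print(year, "train:", train_idx, "test:", test_idx)
```

Splitting by tournament rather than by row keeps teams from the same World Cup out of both the training and the test set at once, which matters when there are only a few tournaments to learn from. scikit-learn's `LeaveOneGroupOut` offers the same kind of split if the tournament year is passed as the group label.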

3. Market Analysis Layer

  • Converts odds into implied probabilities
  • Removes market margin (vig normalization)
  • Computes edge = model probability – market-implied probability
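
The three steps above can be sketched as follows. This is a minimal example that assumes a single outright-winner market quoted in decimal odds and uses simple proportional normalization to remove the margin. Other de-vigging methods exist, and the write-up does not say which one the project uses. All function names and numbers here are illustrative assumptions.

```python
def implied_from_decimal_odds(odds):
    """Raw implied probability per team: 1 / decimal odds."""
    return {team: 1.0 / o for team, o in odds.items()}


def remove_vig(raw_probs):
    """Rescale raw implied probabilities so they sum to 1.

    Proportional normalization (an assumption; one of several methods).
    """
    total = sum(raw_probs.values())
    return {team: p / total for team, p in raw_probs.items()}


def rank_edges(model_probs, market_probs):
    """Return (team, edge) pairs sorted from most undervalued to most overvalued."""
    edges = {
        team: model_probs[team] - market_probs[team]
        for team in model_probs
        if team in market_probs
    }
    return sorted(edges.items(), key=lambda kv: kv[1], reverse=True)


# Toy three-team market; the raw implied probabilities sum to 1.125 (12.5% margin).
odds = {"A": 1.6, "B": 4.0, "C": 4.0}
model = {"A": 0.50, "B": 0.30, "C": 0.20}  # hypothetical model output

market = remove_vig(implied_from_decimal_odds(odds))
ranked = rank_edges(model, market)
for team, edge in ranked:
    print(f"{team}: {edge:+.3f}")
```

A positive edge means the model rates the team higher than the market does (undervalued), and a negative edge means the opposite. For prediction-market contracts quoted as a price between 0 and 1, the price can stand in for the raw implied probability before normalization.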

4. Interface & API

  • Streamlit dashboard (interactive charts, comparisons)
  • FastAPI backend with endpoints:

    • /predictions
    • /mispricing
    • /backtest

Challenges we ran into

  • Data inconsistency: Historical World Cup data across years required normalization and cleaning to maintain feature consistency.

  • Market normalization: Removing bookmaker margins and aligning different market formats (Polymarket vs Kalshi) was non-trivial.

  • Overfitting risk: With only a handful of World Cups to train on, making the model generalize required careful validation (leave-one-tournament-out cross-validation).

  • Real-time integration: Handling live odds reliably while maintaining performance required caching and fallback strategies.
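
One way to combine caching with a fallback for live odds is a small time-to-live cache that serves the last good value when a refresh fails. The sketch below is an assumption about how such a layer could look, not the project's implementation. `CachedOdds`, `flaky_fetch`, and the toy payload are hypothetical.

```python
import time


class CachedOdds:
    """TTL cache around a fetch function; serves stale data if a refresh fails."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return (value, source) where source is "live", "cache", or "stale"."""
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return self._value, "cache"
        try:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value, "live"
        except Exception:
            if self._fetched_at is None:
                raise  # nothing cached yet, so there is no fallback
            return self._value, "stale"


# Demo with a fake clock and a fetcher that fails on its second call.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] == 2:
        raise ConnectionError("upstream unavailable")
    return {"Team A": 0.12}  # toy payload, not real odds


fake_now = [0.0]
cache = CachedOdds(flaky_fetch, ttl_seconds=30.0, clock=lambda: fake_now[0])

first = cache.get()    # first call hits the fetcher
second = cache.get()   # within the TTL, served from cache
fake_now[0] = 31.0     # TTL expired
third = cache.get()    # refresh fails, last good value is served
print(first[1], second[1], third[1])
```

Returning the source alongside the value lets a dashboard flag when the odds it shows are stale instead of silently presenting them as live.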

Accomplishments that we're proud of

  • Achieved 83% performance improvement over baseline market predictions in backtesting
  • Built a complete pipeline from data → model → market → insights → UI
  • Identified real, quantifiable mispricings (e.g., Argentina +18% edge)
  • Delivered both:

    • A user-friendly dashboard
    • A developer-ready API

Most importantly:

We showed that combining AI with market data can create a measurable advantage.

What we learned

  • Markets are strong—but predictably biased in certain scenarios
  • Blending model predictions with market data is more powerful than either alone
  • Validation strategy matters more than model complexity
  • Good visualization is critical for turning data into decisions

What's next for WorldCup-Oracle

  • Expand beyond the World Cup into:

    • Club football (Champions League, leagues)
    • Other sports (NBA, NFL, etc.)
  • Improve the model with:

    • Player-level data
    • Injury tracking
    • Real-time form updates
  • Add:

    • Automated alerts for new mispricings
    • Mobile-friendly interface
    • User personalization
  • Explore integration with:

    • Live betting tools
    • Portfolio tracking systems

WorldCup-Oracle is more than a prediction tool: it’s a system designed to find truth where markets are imperfect.
