Inspiration

Prediction markets are supposed to be efficient — prices reflecting the true probability of an outcome in real time. But watch an NBA game closely and you'll see something strange: when the Lakers go on a 12-0 run in the third quarter, the betting markets barely move. There's a lag. And in that lag, there's an opportunity.

We wanted to find out: could we build a system fast enough and smart enough to live in that gap?

What it does

Courtside Alpha is a fully automated trading system that watches live NBA games, predicts win probabilities in real time, and executes trades on Polymarket — a decentralized prediction market — whenever our model disagrees with the crowd by a meaningful margin.

At its core is a machine learning model trained on 188 features: not just score and time remaining, but momentum shifts, team-specific clutch performance, fatigue dynamics, rolling form trajectories, and in-game statistical rhythms. When the model thinks a team's true win probability is 70% but the market is only pricing them at 55%, it fires.
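The trigger described above can be sketched as a simple comparison. This is an illustration rather than the project's actual code: the `Signal` type, the `detect_edge` name, and the 0.10 threshold are all hypothetical, and the away price is assumed to be one minus the home price (ignoring the bid/ask spread).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signal:
    side: str      # "home" or "away": the side the model thinks is underpriced
    edge: float    # model probability minus market price for that side


def detect_edge(model_p_home: float, market_p_home: float,
                min_edge: float = 0.10) -> Optional[Signal]:
    """Return a Signal when model and market disagree by at least min_edge.

    min_edge is a hypothetical threshold, expressed in probability points.
    """
    for p in (model_p_home, market_p_home):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    edge_home = model_p_home - market_p_home
    if edge_home >= min_edge:
        return Signal("home", edge_home)
    # Assumption: P(away) = 1 - P(home) on both the model and market side.
    if -edge_home >= min_edge:
        return Signal("away", -edge_home)
    return None
```

With the numbers from the text, `detect_edge(0.70, 0.55)` returns a home-side signal with an edge of about 0.15, while a 3-point disagreement returns `None`.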

This isn't a simulation. The system signs real limit orders and submits them directly to Polymarket's order book (the CLOB), where matched trades settle on-chain. Every trade shown in our dashboard corresponds to a real signed order.

How we built it

The system has four layers that talk to each other in real time:

  1. Data Layer: Polls the NBA live API for scores, period, and time remaining every few seconds, while simultaneously pulling real-time odds from Polymarket's Gamma API.

  2. ML Prediction Server: Runs an ensemble of four XGBoost models, each with a distinct role. The proxy model is trained on pregame odds to anchor baseline expectations; the live win classifier predicts P(home wins) from live game state; the margin regressor predicts the final point margin; and the edge classifier predicts whether our signal is reliable enough to act on. All four are trained on 980+ games of play-by-play data from the current NBA season and evaluated out-of-fold, so reported metrics come from games the model did not see during training.

  3. Rust Execution Engine: Connects to Polymarket's WebSocket feed for live odds, computes the edge between our model and the market, and submits signed CLOB orders when the confidence threshold is met. Kelly criterion sizing scales each bet to the size of the estimated edge.

  4. Live Dashboard: Built in Next.js, shows open positions, P&L, model vs. market probabilities, and every active game, updated in real time.
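The Kelly sizing mentioned in the execution layer can be sketched for a binary contract as follows. A share bought at price p pays 1 if the outcome happens, so the net odds are b = (1 - p) / p, and the Kelly fraction (b*q - (1 - q)) / b simplifies to (q - p) / (1 - p) for a model probability q. The quarter-Kelly multiplier and 5% cap below are assumptions added for illustration, not values from the project, and the function names are hypothetical.

```python
def kelly_fraction(model_p: float, price: float) -> float:
    """Full-Kelly fraction of bankroll for buying a binary share at `price`.

    Uses (q - p) / (1 - p), floored at zero so a negative edge means no bet.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= model_p <= 1.0:
        raise ValueError("model_p must lie in [0, 1]")
    return max(0.0, (model_p - price) / (1.0 - price))


def stake(bankroll: float, model_p: float, price: float,
          kelly_multiplier: float = 0.25, max_fraction: float = 0.05) -> float:
    """Dollar stake using fractional Kelly with a hard cap.

    kelly_multiplier and max_fraction are hypothetical risk settings.
    """
    fraction = min(kelly_multiplier * kelly_fraction(model_p, price), max_fraction)
    return bankroll * fraction
```

For a model probability of 0.70 against a price of 0.55, full Kelly comes out to one third of the bankroll, which is why a fractional multiplier and a cap are common safeguards when the model's probability is itself uncertain.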

Challenges we ran into

The hardest part wasn't the ML — it was the plumbing. NBA game clocks don't map cleanly to market timestamps. Polymarket lists teams in a different order than the NBA API. A "home team" in one system is an "away team" in another, and getting those wires crossed means your model is betting backwards.
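One way to guard against crossed wires is to look up the home team's price by name instead of trusting list position. The sketch below assumes the market exposes two outcome names with two matching prices and that names can be matched after trimming and lowercasing. Real feeds may need an explicit alias table, and `home_win_price` is a hypothetical helper.

```python
def home_win_price(outcomes: list[str], prices: list[float], home_team: str) -> float:
    """Return the market price for the home team, whatever order the market uses.

    Raises instead of guessing when the home team cannot be matched, since a
    silent mismatch would make the model bet backwards.
    """
    if len(outcomes) != 2 or len(prices) != 2:
        raise ValueError("expected exactly two outcomes and two prices")
    normalized = [name.strip().lower() for name in outcomes]
    key = home_team.strip().lower()
    if normalized.count(key) != 1:
        raise ValueError(f"home team {home_team!r} not matched uniquely in {outcomes}")
    return prices[normalized.index(key)]
```

Failing loudly on an unmatched name is the design choice here: skipping a game costs nothing, while a flipped probability costs money.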

We caught and fixed a critical data leakage bug during training: our rolling team statistics were accidentally including the current game's result, making the model appear near-perfect in backtesting but useless in production. Lagging the rolling features with .shift(1), so that each row only sees games played before it, and evaluating out-of-fold was the difference between a model that memorizes and one that generalizes.
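The leakage fix comes down to computing each rolling feature from prior games only. The sketch below shows the idea in plain Python. It mirrors what a shift-by-one before a rolling mean does in pandas, but the function name, the window size, and the input layout are assumptions for illustration.

```python
from collections import defaultdict, deque


def lagged_rolling_mean(games, window=5):
    """For each (team, value) in chronological order, return the mean of that
    team's previous `window` values, excluding the current game.

    Returns None for a team's first game, where no history exists.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    features = []
    for team, value in games:
        past = history[team]
        features.append(sum(past) / len(past) if past else None)
        past.append(value)  # only now does the current game enter the history
    return features
```

Appending the current value after the feature is computed is the whole fix: swapping those two lines reproduces the leak, because the feature would then include the result it is meant to predict.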

We also discovered a classic algo-trading pitfall: without a cooldown near game-end, the bot would detect an edge, sell, then immediately re-buy on the very next tick as the market snapped back — cycling hundreds of trades in the final two minutes. We added time-decay guards that shut off new position entry when fewer than 120 seconds remain.
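A late-game entry guard of this kind might look like the sketch below. The 120-second cutoff comes from the text. The function names are hypothetical, and the handling of overtime (counting only the current overtime clock) is an assumption.

```python
REGULATION_PERIODS = 4
PERIOD_SECONDS = 12 * 60          # NBA regulation quarter length
ENTRY_CUTOFF_SECONDS = 120        # from the text: no new entries under 120 s


def seconds_remaining(period: int, clock_seconds: float) -> float:
    """Seconds left in the game, given the period and the clock in that period.

    Assumption: in overtime (period > 4) only the current overtime clock counts.
    """
    if period < 1 or clock_seconds < 0:
        raise ValueError("period must be >= 1 and clock_seconds >= 0")
    if period > REGULATION_PERIODS:
        return clock_seconds
    return (REGULATION_PERIODS - period) * PERIOD_SECONDS + clock_seconds


def may_open_position(period: int, clock_seconds: float) -> bool:
    """Allow new entries only while at least the cutoff time remains.

    This gates entries only; closing existing positions is a separate decision.
    """
    return seconds_remaining(period, clock_seconds) >= ENTRY_CUTOFF_SECONDS
```

Gating only new entries keeps the bot able to exit positions late in the game while stopping the buy/sell cycling described above.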

Accomplishments that we're proud of

We built a complete quantitative trading system in two days — from raw NBA play-by-play data to signed on-chain transactions. The pipeline scrapes 980+ games, engineers 188 features, trains four specialized models, serves predictions via a FastAPI server, executes trades through a Rust engine, and displays everything on a live dashboard.

Our model achieved a Brier score of 0.1888 on out-of-fold live win predictions, meaningfully better than both a coin flip (0.25) and our pre-game proxy (0.2454). During live testing, the system correctly identified the winner in all three games it tracked, detecting actionable edges in competitive matchups where the market was slow to adjust.
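For reference, the Brier score is the mean squared difference between the predicted probability and the 0/1 outcome, so lower is better. A minimal sketch (the function name is ours for illustration, and the inputs in the note below are made up, not project data):

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted P(home wins) and the 0/1 result."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)
```

Always predicting 0.5 scores exactly 0.25 whatever the results, since every term is (0.5)^2. That is where the coin-flip baseline comes from.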

Most importantly, we built the system to be honest with itself. Out-of-fold validation, edge quality filtering, and Kelly-criterion position sizing mean the bot only trades when it has genuine conviction — not just a number that looks good.

What we learned

Markets are slower than you think — and faster than your code. Real-time trading demands correctness at every layer. A single sign-flip bug doesn't show up until you're looking at a position that should be up 20% but is somehow down.

We also learned that market inefficiency is real, but narrow. Our model finds edges of 5 to 15 percentage points fairly regularly. The question is always: is the market wrong, or is our model wrong? Building confidence into that answer, rather than just chasing every signal, is what separates a strategy from a guess.

What's next for Courtside Alpha

Continuous learning from live data. Every game the bot trades generates a new row of ground truth — model prediction, market price, and actual outcome. Over time, this growing dataset allows us to retrain on real market odds rather than simulated proxies, fine-tune edge thresholds based on observed profitability, and detect when market dynamics shift. The goal is a system that gets sharper with every game night, not one that stays frozen at its training snapshot.

Expand to multi-market trading. We currently trade moneylines. Adding spread and over/under markets — which are less liquid and potentially less efficient — could unlock additional alpha.

Sequence modeling. Our current XGBoost models treat each game snapshot independently. An LSTM or transformer architecture that sees the sequence of game states could learn temporal patterns like "teams that go on runs in Q3 tend to extend them into Q4" directly from the data, whereas tree-based models only see such patterns through hand-engineered features like our momentum and rolling-form inputs.

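Cross-platform arbitrage. We're already pulling odds from both Polymarket and Kalshi. When the two platforms disagree on the same game by more than fees and slippage, buying opposite sides locks in a profit that is close to risk-free, provided both orders fill and both markets settle under the same rules. Our infrastructure is already built to detect those gaps.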
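The check itself is simple arithmetic, sketched below under stated assumptions: both venues list the same event with identical settlement rules, shares pay 1 on a win, and fees are reduced to a single hypothetical per-pair number. The function name is illustrative, and no real platform API is used.

```python
def locked_profit_per_share(yes_price_a: float, no_price_b: float,
                            fee_per_share: float = 0.0) -> float:
    """Profit per share pair from buying YES on venue A and NO on venue B.

    Exactly one of the two shares pays 1 at settlement, so the pair returns 1
    and costs yes_price_a + no_price_b plus fees. A positive result is only a
    locked profit if both orders fill at these prices.
    """
    for price in (yes_price_a, no_price_b):
        if not 0.0 <= price <= 1.0:
            raise ValueError("prices must lie in [0, 1]")
    return 1.0 - (yes_price_a + no_price_b) - fee_per_share
```

For example, YES at 0.55 on one venue and NO at 0.40 on the other leaves about 0.05 per pair before fees, and a 0.06 fee would turn that negative.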
