CourtVision

Inspiration

We know how hard it is to get into a sport like basketball, with its overwhelming volume of stats, long and storied history, and heavy emphasis on staying up to date with the chaotic state of the leagues. We hope to ease that with CourtVision, giving everyone a level view of the court, particularly with a chatbot ready to answer any question about basketball.

We were also inspired by the gap between what sports media does (hot takes, vibes, narratives; think Skip Bayless) and what the data actually says. CourtVision attempts to bridge that gap by not just predicting game outcomes but also assessing the accuracy of the stories the media paints.

What It Does

CourtVision is a real-time NBA intelligence platform with the following core features:

Game Predictions: For every game on tonight's slate, CourtVision generates a machine learning-powered prediction including projected winner, predicted final score, win confidence, and three key analytical factors driving the outcome. Predictions are powered by a trained XGBoost model using rolling team stats, rest differentials, home/away splits, offensive and defensive ratings, and pace data.

Player Cards: Dynamic per-player scouting cards generated for key matchups. Each card includes a data-driven performance report, tonight's projected stat line (ranges for points, rebounds, and assists), and a hot/cold/neutral trend indicator based on recent form.

Take Verdicts: A media take fact-checker. Users submit or browse hot takes from sports media, and CourtVision's AI analyst returns a steelman (the strongest stat supporting the take), a challenge (the strongest counter-stat), and a verdict such as Backed by data or Partially supported.

Chat: An AI basketball analyst you can ask anything, from last night's box scores to MVP race analysis.

How We Built It

Frontend: Next.js with Tailwind CSS, using the Vercel AI SDK (ai/react) for streaming chat. The UI is built around a dark, high-contrast sports aesthetic with orange accent colors.

Backend: FastAPI (Python) with async endpoints, deployed with Uvicorn.

Machine Learning: XGBoost trained on two seasons (2023-24, 2024-25) of NBA game logs pulled via nba_api. We engineered 24 features, including rolling offensive/defensive ratings, pace proxies, rest differentials, home-specific and away-specific win percentages, and key differential features. We train three separate models: a home score regressor, an away score regressor, and a win probability classifier.
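As a rough illustration of how rolling and differential features can be built without leaking future results, here is a minimal standard-library sketch. The window size, the 110.0 prior, the field names, and the function name are all assumptions made for this example, not the actual CourtVision feature code, and it tracks only points scored rather than the full 24 features.

```python
from collections import defaultdict, deque
from statistics import mean

WINDOW = 10          # hypothetical rolling window, in games
DEFAULT_PTS = 110.0  # hypothetical prior used before a team has any history


def add_rolling_features(games):
    """Attach pre-game rolling scoring averages and their differential.

    `games` is a list of dicts with keys: date, home, away, home_pts, away_pts.
    Each row only sees games played strictly before it.
    """
    history = defaultdict(lambda: deque(maxlen=WINDOW))
    rows = []
    # ISO date strings sort chronologically, so a plain sort is enough here.
    for game in sorted(games, key=lambda g: g["date"]):
        home_hist = history[game["home"]]
        away_hist = history[game["away"]]
        home_avg = mean(home_hist) if home_hist else DEFAULT_PTS
        away_avg = mean(away_hist) if away_hist else DEFAULT_PTS
        rows.append({
            **game,
            "home_pts_avg": home_avg,
            "away_pts_avg": away_avg,
            "pts_avg_diff": home_avg - away_avg,  # differential feature
        })
        # Update history only after the row is built, so a game never
        # contributes to its own features.
        home_hist.append(game["home_pts"])
        away_hist.append(game["away_pts"])
    return rows
```

The key design choice is the order of operations inside the loop: features are read first and the game's own score is appended afterwards, which keeps every row strictly pre-game.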

AI Layer: Groq is used for player card generation, take verdicts, and as a fallback when the ML model isn't confident. The chat interface streams responses from the Groq API.

Data & Storage: Supabase (Postgres) for game history, player data, team data, prediction caching, and media takes. Upstash Redis for fast in-memory caching with local dict fallback for development.
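The local dict fallback can be sketched as a small TTL cache that stands in whenever no Redis client is configured. The class name, the `make_cache` helper, and the `get`/`set(key, value, ttl_seconds)` interface are assumptions for illustration; a real Upstash client would likely need a thin adapter to match it.

```python
import time


class DictCache:
    """In-process TTL cache used when no Redis client is configured."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)


def make_cache(redis_client=None):
    """Return the supplied client if there is one, else the dict fallback.

    Assumption: `redis_client` exposes the same get/set interface as
    DictCache (in practice via a small adapter around the real client).
    """
    return redis_client if redis_client is not None else DictCache()
```

Because both backends share one interface, the calling code does not need to know which one it is talking to during development.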

Data Pipeline: Custom bootstrap scripts to seed two seasons of historical game data and all active NBA players via nba_api, with a background scheduler to sync live game data.

Challenges We Ran Into

SDK version fragmentation: The Google AI Python ecosystem has two completely incompatible SDKs (google-generativeai and google-genai) with different APIs, different async patterns, and different streaming interfaces. We ended up using Groq instead.

Hardcoded features masking model quality: Our offensive rating, defensive rating, and pace features were accidentally hardcoded to constants (115.0, 112.0, 98.5) for every team and every game. The model was training on six features that were identical for all rows, producing predictions that were almost random. Accuracy was stuck at 59% until we replaced them with real rolling averages.
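A check like the following, run on the training rows before fitting, would likely have surfaced the problem early. It is a minimal sketch; the function name and the feature names in the usage example are hypothetical.

```python
def constant_features(rows, feature_names):
    """Return the feature names whose value is identical across all rows."""
    if not rows:
        return []
    return [
        name
        for name in feature_names
        if len({row[name] for row in rows}) == 1
    ]


# Usage: two rows where the rating is hardcoded but win percentage varies.
rows = [
    {"home_off_rating": 115.0, "home_win_pct": 0.61},
    {"home_off_rating": 115.0, "home_win_pct": 0.44},
]
suspicious = constant_features(rows, ["home_off_rating", "home_win_pct"])
```

Here `suspicious` contains only `home_off_rating`, the column that never changes and therefore carries no signal for the model.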

ID mismatches in mock data: Our mock games used informal string IDs ("lakers", "warriors") while historical data used NBA numeric IDs ("1610612747"). The feature extractor found zero historical rows for every mock team, so every prediction defaulted to an identical 110.0 / 110.0 score across all games. The cache layer then locked in these bad predictions and served them repeatedly.
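One way to guard against both halves of this bug is to normalize IDs at the boundary and refuse to cache default predictions. The sketch below assumes a plain dict as the cache and a callable `model`; the mapping, function names, and result shape are illustrative, and the mapping is deliberately incomplete.

```python
# Illustrative, incomplete mapping from informal names to NBA numeric IDs.
TEAM_IDS = {"lakers": "1610612747"}


class UnknownTeamError(KeyError):
    """Raised when a team name cannot be mapped to an NBA numeric ID."""


def to_nba_id(team):
    """Normalize an informal name or numeric string to an NBA numeric ID."""
    team = str(team).strip().lower()
    if team.isdigit():
        return team
    if team not in TEAM_IDS:
        raise UnknownTeamError(team)  # fail loudly instead of guessing
    return TEAM_IDS[team]


def predict_with_cache(home, away, history, cache, model):
    """Predict a game, caching only predictions backed by real history."""
    home_id, away_id = to_nba_id(home), to_nba_id(away)
    key = f"pred:{home_id}:{away_id}"
    if key in cache:
        return cache[key]
    home_rows = history.get(home_id, [])
    away_rows = history.get(away_id, [])
    if not home_rows or not away_rows:
        # No history: return a clearly labeled default and do NOT cache it.
        return {"home_score": 110.0, "away_score": 110.0, "source": "default"}
    prediction = {**model(home_rows, away_rows), "source": "model"}
    cache[key] = prediction
    return prediction
```

Tagging the result with a `source` field also makes it obvious in the UI or logs when a prediction is only a placeholder.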

Accomplishments That We're Proud Of

End-to-end ML pipeline from raw game logs to live predictions, trained on real NBA data and served in under 200ms with caching.
Dual-mode prediction system: the ML model runs first, with the Groq-backed LLM as a graceful fallback, so the app always returns a prediction regardless of model availability.
Streaming AI chat that correctly implements the Vercel AI SDK data stream protocol against a custom FastAPI backend.
Take Verdict feature: think X’s Grok, but in a sports context, going beyond simple Q&A to actively fact-check media narratives with structured, citation-grounded responses.
Getting from 0 to a fully functional, deployed sports analytics platform in hackathon time.
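The dual-mode flow described above might look roughly like this. It is a sketch under stated assumptions: `ml_predict` and `llm_predict` are hypothetical callables returning dicts, the `confidence` key and the 0.55 threshold are invented for the example, and the logger name is arbitrary.

```python
import logging

logger = logging.getLogger("courtvision.predict")
MIN_CONFIDENCE = 0.55  # hypothetical threshold for trusting the ML model


def predict(game, ml_predict, llm_predict):
    """Try the ML model first; fall back to the LLM and record which one ran."""
    try:
        result = ml_predict(game)
    except Exception:
        # Log the full traceback so a broken model is never a silent fallback.
        logger.exception("ML model failed for %s; using LLM fallback", game)
        result = None

    if result is not None and result["confidence"] >= MIN_CONFIDENCE:
        source = "ml"
    else:
        result = llm_predict(game)
        source = "llm_fallback"

    logger.info("prediction source=%s game=%s", source, game)
    return {**result, "source": source}
```

Returning the `source` alongside the prediction means a fallback that fires on every request shows up immediately in both the logs and the API response.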

What We Learned

Training ML models on sports data requires extreme care around temporal ordering; the difference between a model that generalizes and one that cheats is often invisible until you check feature values manually.
Silent fallbacks can be dangerous; the ML model was never running in production because of a function signature mismatch, and nothing in the logs indicated it. Explicitly logging the source of each prediction helped us diagnose it.
Streaming AI responses across a Next.js frontend and a Python backend requires understanding three layers: the AI SDK's wire protocol, HTTP streaming semantics, and Python's async generator model. A mistake at any layer produces a silent failure.
XGBoost responds much better to differential features (home_win_pct - away_win_pct) than to the raw values separately; adding these six features improved accuracy more than doubling the number of estimators did.
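To make the streaming lesson concrete, here is a minimal sketch of the Python side: an async generator that frames text chunks for the frontend. It assumes the text-part framing `0:<JSON string>` followed by a newline, as documented for the AI SDK data stream protocol in the SDK versions that expose `ai/react`; other part types (finish, error) and response headers are omitted, so check the protocol docs for the SDK version in use. The function name is hypothetical.

```python
import asyncio
import json


async def to_data_stream(chunks):
    """Frame each non-empty text chunk as one data stream text part."""
    async for chunk in chunks:
        if chunk:
            # json.dumps escapes quotes and newlines so each part stays on one line.
            yield f"0:{json.dumps(chunk)}\n"


async def _demo():
    async def fake_llm():
        # Stand-in for an LLM token stream.
        for token in ["Hel", "", 'lo "MVP"\n']:
            yield token

    return [part async for part in to_data_stream(fake_llm())]


parts = asyncio.run(_demo())
```

In a FastAPI route, a generator like this would typically be passed to `StreamingResponse`, which is where the HTTP streaming layer comes in.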

What's Next for CourtVision

Injury-adjusted predictions: Integrate real-time injury reports to adjust lineup strength before tip-off. A team missing its starting center has a fundamentally different defensive profile.

Live in-game updates: Shift predictions dynamically as games progress, updating win probability after each quarter based on actual score and stats.

Elo ratings: Replace simple win percentage with a proper Elo system that accounts for opponent strength, giving a much more accurate picture of team quality than raw record.
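For reference, the standard Elo update is small enough to sketch directly. The K-factor of 20 is an assumed value for illustration, and refinements often used for basketball (home-court advantage, margin of victory) are left out.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner, loser, k=20.0):
    """Return (new_winner_rating, new_loser_rating) after one game."""
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta
```

Because the rating change scales with how unexpected the result was, beating a strong opponent moves a team's rating more than beating a weak one, which is the opponent-strength adjustment that raw win percentage lacks.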

User accounts and prediction tracking: Let users follow their own prediction record over a season and compare against the model.

Expansion to NFL, NCAAB, and other sports: The architecture is sport-agnostic; the feature engineering and model pipeline can be adapted to other leagues with new data sources.