Ballboy
Inspiration
Lower-league soccer clubs make tactical decisions with no data. Premier League clubs have 10-person analyst teams and Opta contracts worth hundreds of thousands. The Championship coach with a staff of three watches the same match as everyone else, making gut decisions in real time.
We wanted to build their analyst staff.
What it does
Ballboy watches any live match through your screen and surfaces tactical insights continuously. Every 15 seconds, five components fire simultaneously:
- Gemini 2.0 Flash reads the broadcast frame and extracts minute, score, and tactical context
- YOLOv8s + ByteTrack detect and track players at 13fps via a calibrated overlay
- Three specialist analysts (defensive, attacking, and physical) run in parallel via Wafer, each generating a specific tactical observation
- Poisson math computes goal probability, substitution likelihood, and momentum from real StatsBomb xG data
- A synthesizer call combines all three analyst outputs into one actionable coach alert
The three analysts complete simultaneously in under 1 second. Full synthesis typically takes 1.5 to 2 seconds. That loop runs continuously for 90 minutes.
How we built it
Vision layer
Gemini 2.0 Flash reads the broadcast frame every 3 seconds, extracting minute, score, visible events, and tactical context. YOLOv8s with ByteTrack drives a calibrated PyQt5 overlay for player detection at 13fps. Players are shown in a unified color to consolidate overall movement patterns, because team classification from broadcast footage is unreliable due to camera movement and lighting.
Inference layer
Three specialist analysts fire in parallel via Wafer. Running them sequentially would take 6 to 8 seconds per cycle, consuming the entire update window before synthesis even runs. Running them in parallel brings the combined time under 1 second. That architectural choice is what makes real-time continuous analysis possible.
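The fan-out pattern described above can be sketched with `asyncio`. This is a minimal illustration, not the project's code: `run_analyst` is a hypothetical stand-in that simulates latency with a sleep, since the Wafer client calls are not shown in this writeup.

```python
import asyncio
import time

ROLES = ["defensive", "attacking", "physical"]

# Hypothetical stand-in for one remote analyst call (the real Wafer API is not shown here).
async def run_analyst(role: str, context: str, latency_s: float = 0.05) -> str:
    await asyncio.sleep(latency_s)  # simulates network and inference latency
    return f"[{role}] observation for: {context}"

async def run_analysts_parallel(context: str) -> list:
    # gather() runs all three coroutines concurrently and keeps the input order.
    return list(await asyncio.gather(*(run_analyst(r, context) for r in ROLES)))

async def run_analysts_sequential(context: str) -> list:
    # Baseline for comparison: each call waits for the previous one to finish.
    return [await run_analyst(r, context) for r in ROLES]

def timed(coro_fn, context: str):
    """Run an async function to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(context))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    ctx = "minute 63, score 1-1"
    _, seq_t = timed(run_analysts_sequential, ctx)
    _, par_t = timed(run_analysts_parallel, ctx)
    print(f"sequential: {seq_t:.2f}s, parallel: {par_t:.2f}s")
```

With concurrent calls, the total time is roughly that of the slowest analyst rather than the sum of all three, which is the gap the writeup describes.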
Predictions layer
Pure Poisson math. No LLM output for predictions. Goal probability is calculated as:
$$P(\text{at least one goal}) = 1 - e^{-\lambda \cdot t}$$
where \(\lambda\) is the team's xG rate per minute (from StatsBomb open data) adjusted for current possession, and \(t\) is minutes remaining, so that \(\lambda \cdot t\) is the expected number of goals in the time left. Substitution probability uses an empirical distribution derived from real match data, where probability peaks at minutes 60 to 65 and 72 to 75.
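As a worked illustration of the goal-probability formula, here is a short sketch. The function names are hypothetical, and the possession adjustment (linear scaling around a 50% share) is an assumption made for the example; the writeup does not specify how the adjustment is actually computed.

```python
import math

def possession_adjusted_rate(xg_per_90: float, possession_share: float) -> float:
    """Per-minute xG rate, scaled by possession.

    Assumption for illustration only: linear scaling around 50% possession.
    """
    return (xg_per_90 / 90.0) * (possession_share / 0.5)

def goal_probability(xg_per_minute: float, minutes_remaining: float) -> float:
    """P(at least one goal) = 1 - exp(-lambda * t) under a Poisson model."""
    if xg_per_minute < 0 or minutes_remaining < 0:
        raise ValueError("rate and time must be non-negative")
    return 1.0 - math.exp(-xg_per_minute * minutes_remaining)

if __name__ == "__main__":
    # Illustrative numbers, not taken from StatsBomb: 1.35 xG per 90, even possession.
    lam = possession_adjusted_rate(xg_per_90=1.35, possession_share=0.5)
    print(f"{goal_probability(lam, minutes_remaining=30):.2f}")
```

With these illustrative inputs, \(\lambda = 0.015\) per minute and \(\lambda \cdot t = 0.45\), giving a probability of about 0.36 that the team scores at least once in the final 30 minutes.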
Data layer
StatsBomb open data pre-loaded at startup via prematch.py. SQLite for historical match patterns. API-Sports for real lineups and match events.
Frontend
Two-panel Electron overlay. Left panel shows the full lineup with position-based energy depletion: wingers deplete faster than defenders, a pattern grounded in sports science. Right panel shows the momentum bar (driven by live ball position), coach alerts, prediction bars with formula steps visible, and parallel simulation mode showing probable match futures.
Challenges
Making inference speed genuinely load-bearing rather than cosmetic. The same three analyst calls that take 6 to 8 seconds sequentially take under 1 second in parallel via Wafer. That gap is the product.
Broadcast computer vision is harder than expected. Camera pans, zooms, and cuts make reliable tracking difficult. We built a PyQt5 calibration system that lets you manually define the exact video region on screen, making detection accurate regardless of layout or resolution.
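One piece of such a calibration step is mapping detections from the cropped video frame back to screen coordinates so the overlay lines up. The sketch below shows that mapping under stated assumptions: `Region` and `to_screen_box` are hypothetical names, and the project's actual PyQt5 code is not shown in this writeup.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """User-defined video area on screen, in screen pixels (hypothetical structure)."""
    left: int
    top: int
    width: int
    height: int

def to_screen_box(box, frame_size, region: Region):
    """Map a detection box from model-frame pixels to screen pixels.

    box: (x1, y1, x2, y2) in the frame passed to the detector
    frame_size: (width, height) of that frame
    """
    frame_w, frame_h = frame_size
    scale_x = region.width / frame_w    # horizontal stretch from frame to screen region
    scale_y = region.height / frame_h   # vertical stretch from frame to screen region
    x1, y1, x2, y2 = box
    return (
        region.left + x1 * scale_x,
        region.top + y1 * scale_y,
        region.left + x2 * scale_x,
        region.top + y2 * scale_y,
    )

if __name__ == "__main__":
    region = Region(left=100, top=50, width=1280, height=720)
    print(to_screen_box((320, 180, 360, 260), frame_size=(640, 360), region=region))
```

Because the mapping depends only on the region and the frame size, the same detector output can be drawn correctly whatever the window layout or screen resolution.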
What we learned
Parallel inference architecture fundamentally changes what's possible. Speed is what determines whether a multi-agent loop can run fast enough to be useful in real time.
What's next
- Homography-based player position mapping for accurate top-down tactical view
- Live API integration for real-time in-game statistics
- Expansion beyond soccer: the vision-to-inference loop is sport-agnostic (basketball, tennis, water polo)
