Inspiration
Prediction markets like Polymarket and Kalshi are already running 2026 World Cup markets - but they're spreadsheets of numbers. Nobody talks to them. I wanted the opposite: a pundit you argue with out loud, who reasons from real data instead of vibes. So I built Oracle, a theatrical AI commentator you speak to and who answers out loud, with every verdict grounded in a model trained on 80 years of international football data.
What it does
You ask about a matchup - "Mexico versus South Korea" - and Oracle replies in seconds: live Win/Draw/Loss probability bars plus a spoken verdict in an authoritative pundit voice. You can feed it live game state ("Spain losing one-nil at sixty-seven minutes") and watch the probabilities flip in real time. Hit "Check the web" and it scrapes Polymarket and Google to show Oracle vs. Market vs. Web side-by-side, so you see where the model disagrees with the crowd.
How I built it
- Voice (Deepgram): Nova-3 STT with team-name keyterms to catch mishearings, Aura-2 TTS with sentence-level streaming so Oracle starts speaking in ~1s. Hands-free VAD mode for continuous listening.
- Prediction engine: Multinomial logistic regression on Elo + form features, trained on 49,437 national-team matches. It reaches 57.7% three-class accuracy on held-out data and is well calibrated: outcomes it predicts at 40% occur about 39% of the time. Live games use a Poisson in-play model scaled by remaining time.
- Analyst (Claude): Haiku 4.5 for low-latency voice replies, Opus 4.8 for richer web-search verdicts. A fact block is injected into the system prompt so the pundit is forbidden from inventing statistics - every percentage it speaks is grounded.
- Live intelligence (Browserbase): Headless CDP sessions scrape Polymarket odds and Google consensus for the model-vs-market comparison.
- Stack: FastAPI + SSE streaming backend, vanilla-JS web UI with Web Audio API.
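To make the in-play idea from the prediction-engine bullet concrete, here is a minimal sketch. It is not Oracle's actual code: it assumes each side's remaining goals are independent Poisson variables whose expected value is a full-match scoring rate scaled by the fraction of time left. The function names, the default rates (`home_rate`, `away_rate`), and the 90-minute match length are all illustrative assumptions.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def in_play_probs(home_goals, away_goals, minute,
                  home_rate=1.4, away_rate=1.1,
                  match_minutes=90, max_goals=10):
    """Return (win, draw, loss) for the first team given the live state.

    home_rate / away_rate are assumed full-match expected goals; in a real
    system they would come from the pre-match model, not from constants.
    """
    # Fraction of the match still to play (clamped at 0 for stoppage time).
    remaining = max(match_minutes - minute, 0) / match_minutes
    lam_home = home_rate * remaining
    lam_away = away_rate * remaining

    win = draw = loss = 0.0
    # Enumerate every plausible pair of additional goals.
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            diff = (home_goals + h) - (away_goals + a)
            if diff > 0:
                win += p
            elif diff == 0:
                draw += p
            else:
                loss += p

    # Renormalise to absorb the tiny mass truncated beyond max_goals.
    total = win + draw + loss
    return win / total, draw / total, loss / total
```

Calling `in_play_probs(0, 1, 67)` models the "losing one-nil at sixty-seven minutes" case: as the minute grows the expected remaining goals shrink, so the probability mass moves toward the current scoreline.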
Challenges I ran into
The main ones: STT mishearings on multi-word nations ("Cape Verde Islands"), a nasty score-parsing bug ("80" being read as 8-0), keeping the voice loop fault-tolerant when a scraper times out, and Deepgram gotchas (sample_rate must be an int). I wrote 69 offline edge-case assertions that caught four demo-killing bugs before judging.
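One way to guard against the "80 read as 8-0" class of bug is to accept a score only when two numbers are joined by an explicit separator, and to treat a number followed by "minutes" as the clock. The sketch below is an illustration under those assumptions, not the parser Oracle ships; `parse_game_state`, the word table, and the assumption that minutes arrive as numerals in the transcript are all hypothetical.

```python
import re

# Spoken score words the transcript might contain (assumed vocabulary).
WORDS = {"nil": 0, "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}

NUM = r"(\d{1,2}|" + "|".join(WORDS) + r")"
# A score needs an explicit separator ("-" or "to") between the two numbers,
# so a bare "80" can never be split into 8 and 0.
SCORE_RE = re.compile(rf"\b{NUM}\s*(?:-|to)\s*{NUM}\b", re.IGNORECASE)
# A minute is a number immediately followed by "min(s)" or "minute(s)".
MINUTE_RE = re.compile(r"\b(\d{1,3})\s*(?:minutes?|mins?)\b", re.IGNORECASE)


def _to_int(token: str) -> int:
    return int(token) if token.isdigit() else WORDS[token.lower()]


def parse_game_state(text: str):
    """Return ((goals_for, goals_against) or None, minute or None)."""
    score = None
    m = SCORE_RE.search(text)
    if m:
        score = (_to_int(m.group(1)), _to_int(m.group(2)))

    minute = None
    m = MINUTE_RE.search(text)
    if m:
        minute = int(m.group(1))
    return score, minute
```

With this shape, `parse_game_state("Brazil ahead at 80 minutes")` yields no score and a minute of 80, while `parse_game_state("2-1 at 80 minutes")` yields both. Offline assertions like these are cheap to run before a demo.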
Accomplishments I'm proud of
A genuinely conversational pundit that's fast (speaks in ~1s), grounded (never hallucinates a stat), and calibrated - not a chatbot wrapper, but a real model with a voice. And it's something I'll actually use: I play soccer, follow the World Cup, and love predicting match outcomes against my friends - now I get to do it with this. It's wild that I can bring an idea like this to life in a weekend; little me would've been amazed.
What I learned
Grounding an LLM to a strict fact block is the difference between a credible analyst and a confident liar. And streaming TTS is what makes a voice agent feel alive.
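A rough sketch of what fact-block grounding can look like, assuming the model's probabilities are rendered as plain text, placed in the system prompt, and the reply is checked afterwards for any percentage that was not supplied. All function names and the prompt wording are hypothetical, and the probabilities in the usage note are placeholders rather than real model output. No LLM call is shown; only the prompt construction and the post-check.

```python
import re

PCT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def build_fact_block(home: str, away: str, probs: dict) -> str:
    """Render the only numbers the pundit is allowed to quote."""
    lines = [f"MATCH: {home} vs {away}"]
    for label, p in probs.items():
        lines.append(f"{label}: {round(p * 100)}%")
    return "\n".join(lines)


def build_system_prompt(fact_block: str) -> str:
    """Wrap the fact block in instructions (wording is illustrative)."""
    return (
        "You are Oracle, a theatrical football pundit.\n"
        "Use ONLY the numbers in the FACTS section. Never invent statistics.\n"
        f"FACTS:\n{fact_block}"
    )


def ungrounded_percentages(reply: str, fact_block: str) -> list:
    """Return every percentage in the reply that is absent from the facts."""
    allowed = set(PCT_RE.findall(fact_block))
    return [p for p in PCT_RE.findall(reply) if p not in allowed]
```

For example, with placeholder probabilities `{"Mexico win": 0.46, "Draw": 0.27, "South Korea win": 0.27}`, a reply that quotes "46%" passes the check, while one that quotes "80%" is flagged and can be regenerated or cut before it reaches TTS.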
What's next
Bracket simulation from pre-match probabilities, multi-language commentary (Aura supports it), and a Kelly-criterion edge finder against live market odds.
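For the edge finder, a minimal sketch of the Kelly calculation, assuming a binary market where a share costs `market_price` (between 0 and 1) and pays 1 if the outcome happens. In that setting the Kelly fraction reduces to (p - c) / (1 - c), where p is the model probability and c is the price. The function name is hypothetical, and fees, spreads, and model error are ignored here.

```python
def kelly_fraction(model_prob: float, market_price: float) -> float:
    """Fraction of bankroll to stake on a binary share priced at market_price.

    Derived from the standard Kelly formula f = (b*p - q) / b with
    net odds b = (1 - c) / c, which simplifies to (p - c) / (1 - c).
    Returns 0.0 when the model sees no positive edge.
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if not 0.0 <= model_prob <= 1.0:
        raise ValueError("model_prob must be between 0 and 1")

    edge = model_prob - market_price
    return max(edge / (1.0 - market_price), 0.0)
```

So if the model says 50% and the market prices the share at 0.40, the full-Kelly stake is one sixth of the bankroll. In practice a fraction of that (for instance half Kelly) is a common hedge against the model being wrong.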
Built With
- anthropic
- browserbase
- claude
- css
- deepgram
- fastapi
- football-data.org
- html
- javascript
- numpy
- pandas
- playwright
- python
- scikit-learn
- server-sent-events
- uvicorn
- web-audio-api