Inspiration

Most prediction market bots treat every market the same: they feed a question to an LLM and trade on whatever it says. We noticed that LLMs are poorly calibrated without grounding data, so we built a bot that only trades where it has a real information edge.

What it does

Prophet Bot is an automated prediction market trading agent. It classifies each market by domain (sports, economics, weather, politics), fetches domain-specific external data, and uses a two-layer AI system to decide whether to trade:

  1. Scout (Gemini 2.5 Flash) — fast, cheap probability estimate grounded in external data
  2. Judge (GPT-5.2) — slower, more careful review that gates every trade

The bot only places a trade when both models agree there's an edge above a calibrated threshold.
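As a rough illustration of that gate, here is a minimal sketch. The names (`Estimate`, `should_trade`, `min_edge`) and the 5-point threshold are hypothetical and chosen for this example; they are not taken from the bot's actual code.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    probability: float  # model's probability that the market resolves YES


def should_trade(scout: Estimate, judge: Estimate,
                 market_price: float, min_edge: float = 0.05) -> bool:
    """Return True only if both models see an edge on the same side.

    min_edge is a placeholder threshold, not the bot's calibrated value.
    """
    scout_edge = scout.probability - market_price
    judge_edge = judge.probability - market_price
    # Both edges must point the same way (both above or both below the price).
    same_side = scout_edge * judge_edge > 0
    # The weaker of the two edges must still clear the threshold.
    return same_side and min(abs(scout_edge), abs(judge_edge)) >= min_edge
```

Requiring the smaller of the two edges to clear the threshold means the Judge can veto a trade simply by being less convinced than the Scout.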

How we built it

  • Data pipelines: The Odds API for bookmaker consensus odds (including outright championship futures), FRED for economic indicators, Open-Meteo for weather forecasts, Tavily for real-time news
  • Calibration: Model estimates are shrunk toward market prices, with per-domain weights tuned on 2,000+ resolved markets. High-probability estimates are additionally dampened to correct model overconfidence.
  • Risk management: Variance-aware Kelly criterion sizing, per-market/team/game/league correlation limits, mutual exclusivity checks (can't bet YES on two teams winning the same championship), stop-loss and edge-scaled take-profit exits
  • Architecture: Tick-based loop using the ai-prophet SDK, deployed on Railway
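The calibration and sizing steps above might look roughly like the sketch below. All function names, the dampening shape, and the default parameters are assumptions made for illustration. Plain fractional Kelly is used here as a simple stand-in for the variance-aware sizing, whose exact form is not described.

```python
def shrink_toward_market(model_p: float, market_p: float, weight: float) -> float:
    """Blend the model estimate with the market price.

    weight is the trust placed in the model (0 = pure market, 1 = pure model);
    in practice it would be a per-domain value.
    """
    return weight * model_p + (1.0 - weight) * market_p


def dampen_high_probability(p: float, cap: float = 0.90, slope: float = 0.5) -> float:
    """Keep only a fraction of any probability mass above `cap` (assumed shape)."""
    if p > cap:
        return cap + slope * (p - cap)
    return p


def kelly_fraction(p: float, price: float, kelly_multiplier: float = 0.25) -> float:
    """Fraction of bankroll to spend buying YES at `price` (pays 1 if YES).

    Full Kelly for a binary contract is (p - price) / (1 - price);
    kelly_multiplier scales it down (fractional Kelly).
    """
    if not 0.0 < price < 1.0:
        return 0.0
    edge = p - price
    if edge <= 0.0:
        return 0.0  # no positive edge, no position
    return kelly_multiplier * edge / (1.0 - price)
```

For example, a model estimate of 0.80 against a market price of 0.60 with a weight of 0.5 shrinks to 0.70, and full Kelly at that point would be 0.10 / 0.40 = 25% of bankroll before any scaling or correlation limits.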

Challenges we ran into

  • The market universe contained 256 markets, but ~95% were long-dated politics markets that would not resolve within the eval window. We had to maximize edge on a handful of sports markets.
  • The bot initially bought contradictory positions (Arsenal YES and Man City YES for the EPL) because family deduplication only worked within a single tick. We added cross-tick mutual exclusivity tracking.
  • Bookmaker match odds were being confused with championship odds — the model saw "Man City 73% to beat Aston Villa" and inferred 40% to win the league. We added outright futures odds fetching to give the model the right signal.
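The cross-tick mutual exclusivity fix can be sketched as a small piece of state that outlives any single tick. The class and method names below are hypothetical, as is the idea of keying by a "family" string such as a championship id.

```python
class ExclusivityTracker:
    """Remembers which outcome is held YES in each mutually exclusive family.

    The tracker is created once and reused on every tick, so a position
    opened in an earlier tick still blocks a contradictory one later.
    """

    def __init__(self) -> None:
        self._held: dict[str, str] = {}  # family id -> outcome held YES

    def can_buy_yes(self, family: str, outcome: str) -> bool:
        held = self._held.get(family)
        return held is None or held == outcome

    def record_buy(self, family: str, outcome: str) -> None:
        if not self.can_buy_yes(family, outcome):
            raise ValueError(f"already holding YES on {self._held[family]} in {family}")
        self._held[family] = outcome

    def release(self, family: str) -> None:
        """Call when the position in this family is fully closed."""
        self._held.pop(family, None)
```

With this in place, buying Arsenal YES in one tick makes `can_buy_yes("EPL winner", "Man City")` return False in every later tick until the Arsenal position is released.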

What we learned

  • Data edge > model sophistication. Bookmaker odds are incredibly well-calibrated; anchoring on them beats letting an LLM guess.
  • Calibration matters more than raw accuracy. Our shrinkage toward market prices prevented most overconfident trades.
  • In a sparse market universe, capital efficiency and selectivity matter more than volume.
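To make the point about anchoring on bookmaker odds concrete, here is a minimal sketch of turning decimal odds for one event into probabilities. The function name is hypothetical, and proportional normalization is just one simple way to strip the bookmaker margin; it is not necessarily what the bot does.

```python
def implied_probabilities(decimal_odds: dict[str, float]) -> dict[str, float]:
    """Convert decimal odds for all outcomes of one event into probabilities.

    1 / odds gives the raw implied probability; the raw values sum to more
    than 1 because of the bookmaker margin, so they are rescaled to sum to 1.
    """
    raw = {name: 1.0 / odds for name, odds in decimal_odds.items()}
    total = sum(raw.values())
    return {name: p / total for name, p in raw.items()}
```

The normalized values only make sense when the input covers every outcome of the same market, which is why outright futures odds, not single-match odds, are the right input for a championship question.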
