Inspiration

Prediction markets are powerful, but how would an AI actually perform if it traded them in real time? We wanted to answer this without the bias of hindsight. The challenge: build a time machine that lets us replay history and watch an LLM make decisions with only the information available at each moment.

What it does

LMT simulates AI trading on prediction markets by reconstructing the information environment at 5-minute intervals across a historical period. At each step it captures market odds and relevant news, feeds them to an LLM, and executes the model's trading logic through Daytona. The system identifies the moments when the AI changed its position and the reasons for each change, and visualizes the model's confidence against market odds over time.
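The 5-minute replay described above can be sketched as a simple timestamp generator. This is a minimal illustration, not our actual orchestrator code: the function name `snapshot_times` and the example dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def snapshot_times(
    start: datetime,
    end: datetime,
    step: timedelta = timedelta(minutes=5),
) -> Iterator[datetime]:
    """Yield replay timestamps from start to end (inclusive) at a fixed step."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    current = start
    while current <= end:
        yield current
        current += step


# Hypothetical 30-minute replay window, in UTC to avoid timezone ambiguity.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
ticks = list(snapshot_times(start, end))
```

Each yielded timestamp would serve as the "as of" time for one simulation step, at which the odds and news are collected and passed to the model.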

How we built it

  • Backend: FastAPI orchestrator managing agent workflows
  • Frontend: Next.js dashboard with Recharts for timeline visualization
  • AI Layer: Daytona executes LLM-generated code for rigorous decision logic
  • Data Pipeline: Kalshi API for market data, Exa for historical news search
  • Core Innovation: Time-travel simulation that prevents information leakage from future events
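The leakage prevention in the last bullet comes down to filtering every input by timestamp before the model sees it. The sketch below shows the idea under simplifying assumptions: `NewsItem`, `OddsTick`, and `build_snapshot` are hypothetical names for illustration and do not reflect the Kalshi or Exa response formats.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class NewsItem:
    published_at: datetime
    headline: str


@dataclass(frozen=True)
class OddsTick:
    ts: datetime
    yes_price: float  # market-implied probability in [0, 1]


def build_snapshot(as_of: datetime, news: list[NewsItem], odds: list[OddsTick]) -> dict:
    """Return only what was knowable at `as_of`: past news and the latest past odds."""
    visible_news = [n for n in news if n.published_at <= as_of]
    past_odds = [o for o in odds if o.ts <= as_of]
    latest: Optional[OddsTick] = max(past_odds, key=lambda o: o.ts) if past_odds else None
    return {"as_of": as_of, "news": visible_news, "odds": latest}


def utc(hour: int, minute: int) -> datetime:
    """Helper for the made-up example data below."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


news = [NewsItem(utc(12, 0), "early report"), NewsItem(utc(12, 20), "late report")]
odds = [OddsTick(utc(12, 0), 0.40), OddsTick(utc(12, 5), 0.45), OddsTick(utc(12, 15), 0.60)]
snapshot = build_snapshot(utc(12, 10), news, odds)
```

In this example the snapshot at 12:10 contains only the early report and the 12:05 price. The 12:15 price and the 12:20 report are excluded because they lie in the future relative to that step.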

Challenges we ran into

Reconstructing clean historical snapshots proved difficult—ensuring the AI never "sees" future information required careful timestamp management. Coordinating between multiple APIs (Kalshi, Exa, OpenAI/Anthropic) while maintaining temporal consistency was complex. We also had to design a breaking point analysis that could trace decisions back to specific information triggers.
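One simple way to approach the breaking point analysis is to compare consecutive decisions and, whenever the position changes, collect the news that first became visible between the two steps. The sketch below assumes a hypothetical `Decision` record and plain `(published_at, headline)` tuples. It surfaces candidate triggers only and does not prove that any item caused the change.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Decision:
    ts: datetime
    position: str      # e.g. "YES", "NO", or "HOLD"
    confidence: float  # the model's stated probability in [0, 1]


def find_breaking_points(
    decisions: list[Decision],
    news: list[tuple[datetime, str]],
) -> list[dict]:
    """List each position change with the news that arrived since the previous step."""
    ordered = sorted(decisions, key=lambda d: d.ts)
    points = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr.position != prev.position:
            # News published after the previous step and no later than this one.
            triggers = [h for (t, h) in news if prev.ts < t <= curr.ts]
            points.append({
                "ts": curr.ts,
                "from": prev.position,
                "to": curr.position,
                "candidate_triggers": triggers,
            })
    return points


def at(minute: int) -> datetime:
    """Helper for the made-up example data below."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


decisions = [
    Decision(at(0), "HOLD", 0.50),
    Decision(at(5), "HOLD", 0.52),
    Decision(at(10), "YES", 0.70),
]
news = [(at(3), "minor update"), (at(8), "major announcement")]
breaking_points = find_breaking_points(decisions, news)
```

Here the only position change happens at 12:10, and the only news item that arrived in the preceding interval is the 12:08 announcement, so it is reported as the candidate trigger.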

Accomplishments that we're proud of

We built a genuine time machine for AI decision-making. The system provides audit trails showing exactly what information caused strategy changes. The interactive visualization makes complex temporal trading patterns immediately comprehensible.

What we learned

Backtesting AI agents is fundamentally different from traditional backtesting—information leakage is the enemy. We learned how to architect agentic systems that execute generated code safely. Most importantly, we discovered that AI trading decisions are highly sensitive to information timing, not just content.

What's next for LLM Prediction Market Backtest

Expand to multiple prediction markets simultaneously. Add strategy optimization where the system learns from what worked. Implement a real-time mode that transitions from backtest to live trading. Build comparison tools to benchmark different LLMs and prompting strategies.

NOTE

Unfortunately, GitHub is down right now, so not all changes have been pushed to GitHub.
