Inspiration

Most Polymarket trading bots we came across rely on the same three approaches: copy trading whales, chasing Twitter sentiment, or latency arbitrage. Meanwhile, traditional finance has had a tool for measuring order flow toxicity since the early 2010s: VPIN (Volume-Synchronized Probability of Informed Trading), introduced by Easley, López de Prado, and O'Hara. Its authors reported that VPIN climbed to unusually high levels in the hours before the 2010 Flash Crash. We asked ourselves: why hasn't anyone brought this to prediction markets? Polymarket does over $1 billion in monthly volume, so there is likely substantial informed flow hiding in plain sight, and as far as we could tell nobody is measuring it.

What it does

ToxFlow detects when "smart money" enters a Polymarket event and tells you which side (YES or NO) they're betting on. It does this by:

  • Volume bucketing: grouping trades into fixed-dollar buckets instead of time intervals, which normalizes for bursty trading activity
  • Measuring VPIN: the rolling imbalance between buy and sell volume as a share of total volume, used as a proxy for informed trading activity
  • Directional VPIN (our innovation): not just "toxicity is high" but "smart money is buying YES"
  • Composite signals: combining VPIN with Synthesis AI forecasts. When the two agree, the signal's confidence is multiplied by 2x; when they disagree, it is down-weighted to 0.3x, which in practice filters it out
  • Monte Carlo validation: running 100+ simulated markets to statistically validate the strategy, not just one cherry-picked backtest
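
The bucketing, VPIN, and composite steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not ToxFlow's actual engine: the function names (`volume_buckets`, `vpin`, `composite_confidence`) are hypothetical, trade sides are assumed to be already classified relative to the YES token (+1 buy, -1 sell), and only the 2x and 0.3x multipliers come from the description above.

```python
import numpy as np

def volume_buckets(sizes, sides, bucket_volume):
    """Split trades into equal-volume buckets.

    sizes: dollar volume per trade; sides: +1 for a buy, -1 for a sell.
    A trade that straddles a bucket boundary is split across buckets.
    Returns (buy_volume, sell_volume) arrays, one entry per full bucket.
    """
    buys, sells = [], []
    buy = sell = filled = 0.0
    for size, side in zip(sizes, sides):
        remaining = float(size)
        while remaining > 0:
            take = min(remaining, bucket_volume - filled)
            if side > 0:
                buy += take
            else:
                sell += take
            filled += take
            remaining -= take
            if filled >= bucket_volume - 1e-9:  # bucket is full, start a new one
                buys.append(buy)
                sells.append(sell)
                buy = sell = filled = 0.0
    return np.array(buys), np.array(sells)

def vpin(buys, sells, window):
    """Rolling VPIN and signed imbalance over the last `window` buckets.

    VPIN is sum(|buy - sell|) / sum(buy + sell) over the window. The signed
    version keeps the sign of (buy - sell), so a positive value means net
    buying pressure (an assumed form of "directional VPIN").
    """
    if len(buys) < window:
        return np.array([]), np.array([])
    kernel = np.ones(window)
    denom = np.convolve(buys + sells, kernel, mode="valid")
    unsigned = np.convolve(np.abs(buys - sells), kernel, mode="valid") / denom
    signed = np.convolve(buys - sells, kernel, mode="valid") / denom
    return unsigned, signed

def composite_confidence(vpin_side, forecast_side, agree=2.0, disagree=0.3):
    """Multiplier applied when the flow direction and the forecast agree or not."""
    return agree if vpin_side == forecast_side else disagree
```

With trades of $60 (buy), $40 (sell), and $100 (buy) and a $100 bucket, the two buckets hold (60 buy, 40 sell) and (100 buy, 0 sell), so a two-bucket window gives a VPIN of 120 / 200 = 0.6 with a positive sign, which under these assumptions would read as informed buying of YES.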

The full-stack dashboard lets you tune parameters in real time, visualize VPIN spikes overlaid on price, see signal heatmaps, P&L curves, and trade logs.

How we built it

  • Core engine in Python with NumPy — volume bucketing, tick-rule trade classification, VPIN calculation, z-score spike detection
  • FastAPI backend serving backtest and Monte Carlo endpoints
  • React 19 + TypeScript frontend with Recharts for interactive VPIN charts, P&L curves, signal heatmaps, and Monte Carlo histograms
  • Synthesis API for real Polymarket market data — trade history, live prices, orderbooks, and WebSocket streams
  • Synthetic market generator for backtesting — realistic microstructure with noise trading, informed bursts at 3-10x volume, and gradual price discovery toward resolution
  • Toxicity momentum strategy with position sizing (3-8% of capital), profit targets (12%), stop losses (4%), and VPIN reversal exits
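
To make the tick-rule classification and z-score spike detection mentioned above concrete, here is a small sketch. The function names, the `lookback` window, and the threshold of 2.0 are assumptions for illustration; the write-up does not state the values ToxFlow uses.

```python
import numpy as np

def tick_rule(prices):
    """Classify each trade as a buy (+1) or sell (-1) from price changes.

    An uptick is treated as a buy and a downtick as a sell. A zero tick
    inherits the previous trade's side. Trades before the first price
    change have no reference and stay at 0 (unclassified).
    """
    prices = np.asarray(prices, dtype=float)
    sides = np.zeros(len(prices), dtype=int)
    for i in range(1, len(prices)):
        diff = prices[i] - prices[i - 1]
        if diff > 0:
            sides[i] = 1
        elif diff < 0:
            sides[i] = -1
        else:
            sides[i] = sides[i - 1]
    return sides

def zscore_spikes(series, lookback, threshold=2.0):
    """Flag values whose z-score against the preceding `lookback` values
    exceeds `threshold`. The first `lookback` points are never flagged."""
    series = np.asarray(series, dtype=float)
    flags = np.zeros(len(series), dtype=bool)
    for i in range(lookback, len(series)):
        history = series[i - lookback:i]
        std = history.std()
        if std > 0:
            flags[i] = (series[i] - history.mean()) / std > threshold
    return flags
```

For example, `tick_rule([0.50, 0.52, 0.52, 0.49, 0.49])` yields `[0, 1, 1, -1, -1]`. Comparing each VPIN reading only against earlier readings keeps the spike detector free of look-ahead, which matters when the same code is reused in a backtest.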

Challenges we ran into

  • Trade classification on Polymarket — the CLOB doesn't always explicitly label buy/sell side, so we had to implement the tick rule (inferring direction from price movement) and validate it against known patterns
  • Tuning VPIN parameters — bucket volume and window size dramatically affect signal quality. Too sensitive and you get false positives on noise; too conservative and you miss real informed flow. We solved this with Monte Carlo validation across randomized parameter ranges
  • Adapting VPIN from equities to binary outcomes — traditional VPIN assumes continuous price spaces. Prediction markets resolve to 0 or 1 with YES/NO tokens, requiring us to rethink how directional flow maps to outcomes
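
One possible way to handle the YES/NO token problem is to express every trade in YES terms before bucketing. The sketch below assumes that buying NO is treated as selling YES (and vice versa), and that a NO price of p corresponds to a YES price of 1 - p. The `TokenTrade` type and `to_yes_flow` function are hypothetical and may differ from the mapping ToxFlow actually uses.

```python
from dataclasses import dataclass

@dataclass
class TokenTrade:
    token: str    # "YES" or "NO"
    side: str     # "BUY" or "SELL"
    price: float  # price of the traded token, between 0 and 1
    size: float   # number of shares traded

def to_yes_flow(trade):
    """Return (yes_price, signed_dollar_volume) expressed in YES terms.

    Positive volume means pressure toward YES, negative toward NO.
    Dollar volume is measured on the token that actually traded.
    """
    direction = 1 if trade.side == "BUY" else -1
    if trade.token == "NO":
        direction = -direction          # buying NO is a bet against YES
        yes_price = 1.0 - trade.price   # complementary price
    else:
        yes_price = trade.price
    return yes_price, direction * trade.price * trade.size
```

Under this mapping, buying 100 NO shares at 0.30 counts as roughly $30 of selling pressure on YES at an implied YES price of 0.70.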

Accomplishments that we're proud of

  • To our knowledge, the first application of VPIN to prediction markets, bringing an established institutional technique to a new asset class
  • Directional VPIN — our novel extension that tells you not just that informed trading is happening, but which side it favors
  • Statistical rigor — Monte Carlo validation across 100 markets with randomized parameters, not just one backtest. Our best configuration achieved a 1.28 profit factor with +$3,157 profit on $10,000 capital
  • Full-stack interactive dashboard — parameter tuning, real-time visualization, and trade-level transparency, all in the browser
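
For readers unfamiliar with the metric, profit factor is conventionally gross profit divided by gross loss, so a value of 1.28 means about $1.28 won for every $1 lost. The sketch below assumes that conventional definition. The function name and the sample P&L values in the usage note are illustrative only and are not taken from ToxFlow's backtests.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss for a list of per-trade P&L values.

    Returns infinity when there are winning trades but no losing ones,
    and 0.0 when there are no winning trades at all.
    """
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss
```

As a made-up example, trades of +120, -50, +80, -100, and +42 give a gross profit of 242 and a gross loss of 150, for a profit factor of about 1.61. In a Monte Carlo setting, the same function would be applied to each simulated market's trade log so the distribution of outcomes can be inspected rather than a single number.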

What we learned

  • Market microstructure techniques from traditional finance appear largely unexplored in prediction markets, which suggests there is untapped edge
  • Volume-synchronized analysis suits bursty markets like Polymarket better than time-based analysis, because trading activity is highly uneven and fixed time windows mix quiet and frantic periods
  • Combining quantitative signals (VPIN) with AI forecasts creates a useful filter; in our backtests the agreement/disagreement multiplier removed many false positives
  • Monte Carlo validation is essential: single backtests are misleading, and parameter sensitivity can make or break a strategy

What's next for ToxFlow

  • Live trading integration — connecting to Polymarket's CLOB for real-time order execution when VPIN spikes are detected
  • Multi-market monitoring — scanning dozens of active markets simultaneously and ranking them by toxicity levels
  • Smart wallet clustering — expanding our wallet tracker to build profiles of informed traders across markets and weighting their flow in real time
  • Additional microstructure metrics — adding Kyle's Lambda (price impact), Amihud illiquidity ratio, and order flow imbalance to create a richer signal composite
  • Public API — exposing ToxFlow's toxicity readings as an API so other builders can integrate orderflow intelligence into their own strategies
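
As a rough sketch of two of the planned metrics, the Amihud illiquidity ratio is commonly computed as the average of absolute return divided by dollar volume, and Kyle's lambda is commonly estimated as the slope from regressing price changes on signed order flow. The function names and input layout below are assumptions, since none of this is implemented in ToxFlow yet.

```python
import numpy as np

def amihud_illiquidity(prices, dollar_volumes):
    """Mean of |return| / dollar volume across periods.

    prices and dollar_volumes are aligned per period; the first period has
    no return and is skipped, as are periods with zero volume. Returns NaN
    if no period has positive volume.
    """
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(dollar_volumes, dtype=float)[1:]
    returns = np.abs(np.diff(prices) / prices[:-1])
    mask = volumes > 0
    return float(np.mean(returns[mask] / volumes[mask]))

def kyle_lambda(price_changes, signed_volumes):
    """Slope of a least-squares fit of price change on signed volume.

    A larger slope means each dollar of net buying moves the price more.
    """
    x = np.asarray(signed_volumes, dtype=float)
    y = np.asarray(price_changes, dtype=float)
    slope, _intercept = np.polyfit(x, y, 1)
    return float(slope)
```

Both functions take per-period aggregates, so they could reuse the same volume buckets as VPIN rather than fixed time intervals, though that choice is left open here.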
