Kalshi Watchdog — Prediction Market Surveillance Platform
Inspiration
Prediction markets are one of the most exciting developments in modern finance. Kalshi, the first CFTC-regulated prediction exchange, lets you trade on real-world events — elections, weather, economic data, even whether the Fed will cut rates. But with real money on the line and markets tied to world events, a natural question emerges: what does insider trading look like in a prediction market?
Traditional stock exchanges have decades of surveillance infrastructure. Prediction markets have almost none. We saw an opportunity to build a system that could catch the kinds of patterns regulators look for — volume spikes before resolution, coordinated trading bursts, and suspiciously well-timed bets on longshot outcomes — and make the results accessible through a modern dashboard with AI-generated case narratives.
The inspiration also came from real incidents: a MrBeast editor betting on video outcomes before release, a California governor candidate's associates positioning before endorsement announcements, and large coordinated bets appearing hours before military operations were publicly announced. These aren't hypotheticals — they've already happened.
What it does
Kalshi Watchdog ingests market and trade data from the Kalshi API, runs three anomaly detection algorithms, enriches findings with AI analysis (Claude 3 Haiku via AWS Bedrock), and presents everything through an interactive React dashboard.
Three Detection Algorithms
Volume Spike Detection — Bins trades into hourly buckets and flags hours where volume exceeds mean + n * standard deviation (default n = 3) across the full market history. A z-score of 4+ triggers HIGH severity. This catches unexplained pre-resolution volume surges.
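As a rough illustration of the hourly z-score test described above, here is a minimal Python sketch. The function name, the (timestamp, contract count) input shape, the treatment of empty hours as zero-volume buckets, and the "MEDIUM" label for flags below z = 4 are all assumptions made for the example, not details taken from the project.

```python
from collections import Counter
from statistics import mean, pstdev

SECONDS_PER_HOUR = 3600


def detect_volume_spikes(trades, n_sigma=3.0, high_z=4.0):
    """Flag hourly buckets whose volume exceeds mean + n_sigma * std.

    trades: iterable of (unix_timestamp_seconds, contract_count) pairs.
    """
    hourly = Counter()
    for ts, count in trades:
        hourly[int(ts) // SECONDS_PER_HOUR] += count
    if not hourly:
        return []

    # Assumption: hours with no trades count as zero-volume buckets.
    hours = range(min(hourly), max(hourly) + 1)
    volumes = [hourly.get(h, 0) for h in hours]
    mu, sigma = mean(volumes), pstdev(volumes)
    if sigma == 0:
        return []  # perfectly flat history, nothing stands out

    flagged = []
    for hour, volume in zip(hours, volumes):
        z = (volume - mu) / sigma
        if z > n_sigma:
            flagged.append({
                "hour_start": hour * SECONDS_PER_HOUR,
                "volume": volume,
                "z_score": z,
                # "MEDIUM" is a hypothetical label for flags below high_z.
                "severity": "HIGH" if z >= high_z else "MEDIUM",
            })
    return flagged
```

Because the mean and standard deviation are computed over the same history that contains the spike, a market needs a reasonably long record before any single hour can clear the 3-sigma bar.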
Coordinated Activity Detection — Identifies bursts of trades within 5-minute clusters with high directional consistency. Calculates z-scores relative to 15-minute baseline windows. This is the signature of algorithmic order splitting or synchronized trader action.
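One possible reading of the burst detector, sketched in Python. The writeup does not spell out how the 15-minute baseline is used, so this example assumes each 15-minute window's trade count is divided by three to give an expected per-5-minute rate. The thresholds (min_trades, min_consistency, z_threshold) and the input shape are likewise hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

BURST_SECONDS = 5 * 60
BASELINE_SECONDS = 15 * 60


def detect_coordinated_bursts(trades, min_trades=5, min_consistency=0.8,
                              z_threshold=3.0):
    """trades: iterable of (unix_timestamp_seconds, side), side is "yes"/"no"."""
    bursts = defaultdict(Counter)   # 5-minute bucket -> trade count per side
    baseline = Counter()            # 15-minute window -> trade count
    for ts, side in trades:
        bursts[int(ts) // BURST_SECONDS][side] += 1
        baseline[int(ts) // BASELINE_SECONDS] += 1
    if not baseline:
        return []

    # Assumption: a 15-minute window holds three 5-minute buckets, so its
    # count divided by 3 is the expected trades per bucket in that window.
    windows = range(min(baseline), max(baseline) + 1)
    rates = [baseline.get(w, 0) / 3 for w in windows]
    mu, sigma = mean(rates), pstdev(rates)
    if sigma == 0:
        return []

    flagged = []
    for bucket, sides in sorted(bursts.items()):
        total = sum(sides.values())
        dominant_side, dominant_count = sides.most_common(1)[0]
        consistency = dominant_count / total
        z = (total - mu) / sigma
        if (total >= min_trades and consistency >= min_consistency
                and z > z_threshold):
            flagged.append({
                "window_start": bucket * BURST_SECONDS,
                "trade_count": total,
                "dominant_side": dominant_side,
                "consistency": consistency,
                "z_score": z,
            })
    return flagged
```

Requiring both a high z-score and high directional consistency is what separates a one-sided burst from ordinary busy periods where buyers and sellers are both active.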
Golden Window Detection — The most sophisticated detector. Flags trades on extreme-probability markets (YES < 5 cents or NO < 5 cents) placed within 48 hours of resolution. A scoring function combines:
Score = P(correct outcome) x Volume x (1 / time-to-resolution)
CRITICAL severity triggers when all four conditions hold: a z-score above 4, odds below 10%, at least $10K notional, and fewer than 12 hours to resolution. This is the textbook insider trading profile.
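The window test, score, and CRITICAL rule above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the writeup does not say how P(correct outcome) is estimated (so it is passed in as a plain argument), and the small floor on time-to-resolution is an added guard against division by zero.

```python
GOLDEN_WINDOW_HOURS = 48
EXTREME_PRICE_CENTS = 5
MIN_HOURS = 1e-3  # assumption: floor to avoid dividing by zero at resolution


def in_golden_window(yes_price_cents, no_price_cents, hours_to_resolution):
    """True for extreme-probability markets within 48 hours of resolution."""
    extreme = (yes_price_cents < EXTREME_PRICE_CENTS
               or no_price_cents < EXTREME_PRICE_CENTS)
    return extreme and 0 <= hours_to_resolution <= GOLDEN_WINDOW_HOURS


def golden_window_score(p_correct, volume, hours_to_resolution):
    """Score = P(correct outcome) x Volume x (1 / time-to-resolution)."""
    hours = max(hours_to_resolution, MIN_HOURS)
    return p_correct * volume * (1.0 / hours)


def is_critical(z_score, implied_probability, notional_usd,
                hours_to_resolution):
    """All four CRITICAL conditions from the writeup must hold together."""
    return (z_score > 4
            and implied_probability < 0.10
            and notional_usd >= 10_000
            and hours_to_resolution < 12)


if __name__ == "__main__":
    # A 3-cent YES contract, 6 hours before resolution, $12K notional.
    print(in_golden_window(3, 97, 6))          # True
    print(is_critical(4.5, 0.03, 12_000, 6))   # True
```

The inverse-time factor means the same trade scores twice as high at 6 hours out as at 12, which is the behavior the scoring formula describes.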
AI-Powered Analysis
Each flagged anomaly is sent to Claude 3 Haiku via AWS Bedrock with full context: market title, anomaly type, trade count, volume, probability, hours-to-resolution, and z-score. Claude returns a structured JSON analysis with a summary, reasoning chain, severity assessment, and possible explanations. If Bedrock is unavailable, the system falls back to heuristic analysis — detection never depends on the LLM.
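The enrichment-with-fallback pattern might look roughly like the Python sketch below. The JSON field names (summary, reasoning, severity, explanations), the heuristic rule, and the invoke_model callable are assumptions for illustration. In the real system the callable would wrap a Bedrock request; here it is injected so the fallback path can be shown without any AWS dependency.

```python
import json

REQUIRED_KEYS = {"summary", "reasoning", "severity", "explanations"}


def heuristic_analysis(anomaly):
    """Rule-based fallback used when the LLM is unavailable (hypothetical)."""
    z = anomaly.get("z_score", 0.0)
    return {
        "summary": f"{anomaly.get('anomaly_type', 'anomaly')} on "
                   f"{anomaly.get('market_title', 'unknown market')}",
        "reasoning": [f"z-score of {z:.1f} relative to market baseline"],
        "severity": "HIGH" if z >= 4 else "MEDIUM",
        "explanations": ["public news", "informed trading", "noise"],
        "source": "heuristic",
    }


def analyze_anomaly(anomaly, invoke_model=None):
    """Try the LLM first; fall back to heuristics on any failure.

    invoke_model: callable taking a prompt string and returning the model's
    raw text response, or None when no LLM client is configured.
    """
    if invoke_model is not None:
        try:
            prompt = ("Analyze this prediction-market anomaly and reply "
                      "with JSON only:\n" + json.dumps(anomaly))
            parsed = json.loads(invoke_model(prompt))
            if isinstance(parsed, dict) and REQUIRED_KEYS.issubset(parsed):
                parsed["source"] = "llm"
                return parsed
        except Exception:
            pass  # network error, throttling, malformed JSON, and so on
    return heuristic_analysis(anomaly)
```

Validating the response shape before trusting it means a malformed or partial model reply is handled the same way as an outage.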
Dashboard Features
- Force Graph — Network visualization connecting anomalies to markets and historical insider-trading case parallels
- Candlestick Charts — OHLC price movement per market with anomaly overlay
- Orderbook Depth — Real-time bid/ask spread visualization
- Anomaly Timeline — Scatter plot of detections by time and severity
- Market Inspector — Full trade-level drill-down with flagged context
- Watchlist — Personal market tracking with category browsing and anomaly badges
- Known Cases — 6 real/fictional insider-trading parallels for regulatory context
- Admin Dashboard — User management, request analytics, and usage metrics
How we built it
Dual-Mode Architecture
The most interesting architectural decision was building a single codebase that runs both locally and on AWS with zero code changes. A single environment variable (STORAGE_BACKEND) controls the entire storage layer:
| Layer | Local Mode | AWS Mode |
|---|---|---|
| Storage | SQLite (WAL mode) | DynamoDB (6 tables) |
| Real-time push | Server-Sent Events (SSE) | WebSocket via API Gateway v2 |
| Pipeline | Manual (frontend buttons) | Step Functions + EventBridge (every 2h) |
| Auth | Bypass mode (user_id = "local") | AWS Cognito (JWT) |
Every storage function dispatches at runtime — batch_write_trades, get_anomalies, add_to_watchlist — all route to SQLite or DynamoDB through a clean abstraction in utils/dynamo.py. The local dev server (local_api.py) wraps the same Lambda handler in a ThreadingHTTPServer, so the exact same code path executes locally and in production.
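A minimal sketch of how such a runtime dispatch could be structured, assuming STORAGE_BACKEND takes the values "sqlite" and "dynamodb". The class names, the SQLITE_PATH and WATCHLIST_TABLE variables, and the table schema are hypothetical and are not taken from utils/dynamo.py. boto3 is imported lazily so the SQLite path runs without it.

```python
import os
import sqlite3


class SQLiteBackend:
    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute("PRAGMA journal_mode=WAL")  # no-op for :memory:
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS watchlist ("
            "user_id TEXT, ticker TEXT, PRIMARY KEY (user_id, ticker))")

    def add_to_watchlist(self, user_id, ticker):
        with self.conn:  # commits on success
            self.conn.execute(
                "INSERT OR IGNORE INTO watchlist VALUES (?, ?)",
                (user_id, ticker))

    def get_watchlist(self, user_id):
        rows = self.conn.execute(
            "SELECT ticker FROM watchlist WHERE user_id = ? ORDER BY ticker",
            (user_id,))
        return [row[0] for row in rows]


class DynamoBackend:
    def __init__(self, table):
        self.table = table  # a boto3 DynamoDB Table resource

    def add_to_watchlist(self, user_id, ticker):
        self.table.put_item(Item={"user_id": user_id, "ticker": ticker})

    def get_watchlist(self, user_id):
        from boto3.dynamodb.conditions import Key  # lazy: AWS mode only
        page = self.table.query(
            KeyConditionExpression=Key("user_id").eq(user_id))
        return sorted(item["ticker"] for item in page["Items"])


def get_backend(env=None):
    """Pick the storage implementation from STORAGE_BACKEND at runtime."""
    env = os.environ if env is None else env
    mode = env.get("STORAGE_BACKEND", "sqlite").lower()
    if mode == "sqlite":
        return SQLiteBackend(env.get("SQLITE_PATH", ":memory:"))
    if mode == "dynamodb":
        import boto3  # lazy: not needed for local development
        name = env.get("WATCHLIST_TABLE", "Watchlist")
        return DynamoBackend(boto3.resource("dynamodb").Table(name))
    raise ValueError(f"Unknown STORAGE_BACKEND: {mode!r}")
```

Callers only ever see add_to_watchlist and get_watchlist, so handler code stays identical across both modes.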
AWS Services (13 total)
| Service | Purpose |
|---|---|
| Lambda | 8 functions: API, market/trade ingestion, detection, analysis, WebSocket connect/disconnect/broadcast |
| DynamoDB | 6 tables: Trades, Markets, Anomalies, Connections, Watchlist, Usage |
| DynamoDB Streams | Triggers WebSocket broadcast on new anomaly inserts |
| API Gateway | REST (HTTP routes) + WebSocket v2 (real-time push) |
| Step Functions | Orchestrates: Ingest Markets → Ingest Trades → Run Detection |
| EventBridge | Scheduled rule (every 2 hours) triggers the pipeline |
| Bedrock | Claude 3 Haiku for AI anomaly analysis and market explanations |
| S3 | Raw data archival with 30-day STANDARD_IA lifecycle |
| Cognito | User Pool with email auth, admin groups, JWT authorization |
| Amplify | Frontend hosting with CI/CD from GitHub |
| CloudWatch | Custom dashboard with Lambda metrics and DynamoDB capacity |
| X-Ray | Distributed tracing across all Lambda functions |
| SNS | CRITICAL anomaly alert notifications |
Frontend
React 18 + TypeScript + Vite, styled with Tailwind CSS and animated with Framer Motion. Recharts handles all charting (candlesticks, timelines, breakdowns) and react-force-graph-2d powers the network visualization. Authentication flows through AWS Amplify's Cognito integration with protected routes and admin-only views.
Kalshi API Integration
The Kalshi client uses RSA-signed authentication — each request includes a timestamp, method, and path signed with a private key:
signature = PKCS1v15(SHA256(timestamp || METHOD || path))
The key loads from a file path locally or from a base64-encoded environment variable in Lambda, so private keys never touch version control.
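The key-loading and message-assembly steps could be sketched as below. The environment variable names and the millisecond timestamp format are assumptions for the example. The signing call itself is left out because it needs a third-party cryptography library; with the `cryptography` package it would be a `private_key.sign(...)` call using the padding and hash named in the formula above.

```python
import base64
import time
from pathlib import Path


def load_private_key_pem(env):
    """Return PEM bytes from a file path (local) or a base64 env var (Lambda).

    KALSHI_PRIVATE_KEY_PATH and KALSHI_PRIVATE_KEY_B64 are hypothetical names.
    """
    path = env.get("KALSHI_PRIVATE_KEY_PATH")
    if path:
        return Path(path).read_bytes()
    encoded = env.get("KALSHI_PRIVATE_KEY_B64")
    if encoded:
        return base64.b64decode(encoded)
    raise RuntimeError("No Kalshi private key configured")


def build_signing_message(method, path, timestamp_ms=None):
    """Concatenate timestamp, upper-cased HTTP method, and request path.

    Returns (timestamp_string, message_bytes); the timestamp is sent with
    the request so the server can rebuild the same message.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # assumption: milliseconds
    ts = str(timestamp_ms)
    return ts, (ts + method.upper() + path).encode("utf-8")
```

Keeping the key source behind one function means the rest of the client never needs to know whether it is running locally or in Lambda.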
Challenges we faced
1. Trade Ingestion Sequencing
The trade ingestion endpoint queries for settled markets first, then fetches trades per market. But locally, if you hit "Ingest Trades" before "Ingest Markets," the query returns zero markets and the whole pipeline silently produces nothing. We solved this with auto-market ingestion — when the trade handler detects no settled markets, it internally calls the market ingestion handler first, then retries.
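The auto-ingestion fix can be illustrated with a small Python sketch. The dict-based store, the "status" field, and the injected fetch functions are stand-ins invented for the example; the real handlers talk to the Kalshi API and the storage layer.

```python
def ingest_markets(store, fetch_markets):
    """Fetch markets and index them by ticker."""
    markets = fetch_markets()
    store["markets"] = {m["ticker"]: m for m in markets}
    return len(markets)


def settled_tickers(store):
    """Tickers of markets already stored with a settled status."""
    return [ticker for ticker, market in store.get("markets", {}).items()
            if market.get("status") == "settled"]


def ingest_trades(store, fetch_markets, fetch_trades):
    """Ingest trades per settled market, bootstrapping markets if needed."""
    tickers = settled_tickers(store)
    if not tickers:
        # No settled markets yet: run market ingestion first, then retry.
        ingest_markets(store, fetch_markets)
        tickers = settled_tickers(store)

    trades = store.setdefault("trades", {})
    for ticker in tickers:
        trades[ticker] = fetch_trades(ticker)
    return {
        "markets": len(tickers),
        "trades": sum(len(trades[t]) for t in tickers),
    }
```

The retry happens at most once, so an exchange with genuinely no settled markets still returns an empty result instead of looping.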
2. Local/Lambda Import Compatibility
Lambda expects flat imports (from utils.kalshi_client import ...) but running locally with python -m backend.local_api puts the wrong directory on sys.path. The local server now injects backend/ onto sys.path at startup so Lambda-style imports resolve correctly in both environments.
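A sketch of the kind of startup shim this describes; the helper name is hypothetical, and the actual code in local_api.py may differ.

```python
import sys
from pathlib import Path


def ensure_importable(directory):
    """Put a directory at the front of sys.path once, so flat imports work."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved
```

Called near the top of the local server with the backend/ directory (for example `ensure_importable(Path(__file__).parent)`), this lets `from utils.kalshi_client import ...` resolve the same way it does inside the Lambda runtime.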
3. DynamoDB Reset Operations
The initial reset endpoints only worked with SQLite (just delete rows). DynamoDB requires scanning the entire table to get all keys, then batch-deleting them — a fundamentally different operation. We had to implement scan-and-batch-delete for all 6 tables.
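A scan-and-batch-delete helper along these lines might look like the following sketch. It expects a boto3 DynamoDB Table resource and uses that resource's scan and batch_writer calls; the function name and the idea of passing key attribute names explicitly are choices made for the example.

```python
def clear_table(table, key_names):
    """Delete every item from a DynamoDB table via scan + batch delete.

    table: a boto3 DynamoDB Table resource.
    key_names: the table's key attributes, e.g. ["ticker", "trade_id"].
    Returns the number of delete requests issued.
    """
    # Alias attribute names so reserved words cannot break the projection.
    aliases = {f"#k{i}": name for i, name in enumerate(key_names)}
    scan_kwargs = {
        "ProjectionExpression": ", ".join(aliases),
        "ExpressionAttributeNames": aliases,
    }

    deleted = 0
    # batch_writer buffers requests into batches and resends any
    # unprocessed items automatically.
    with table.batch_writer() as batch:
        while True:
            page = table.scan(**scan_kwargs)
            for item in page.get("Items", []):
                batch.delete_item(Key={name: item[name] for name in key_names})
                deleted += 1
            last_key = page.get("LastEvaluatedKey")
            if not last_key:
                break  # final page reached
            scan_kwargs["ExclusiveStartKey"] = last_key
    return deleted
```

Projecting only the key attributes keeps the scan cheap, since a delete needs nothing but the key. For very large tables, dropping and recreating the table is often faster than deleting item by item.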
4. Real-Time Push Across Two Transports
Supporting both SSE (local) and WebSocket (AWS) from a single frontend component required careful abstraction. The LiveDetectionStream component checks for a WebSocket URL and falls back to SSE, with reconnection logic for both.
5. Golden Window False Positives
Early versions of the golden window detector flagged every cheap market near resolution. We refined the scoring function to weight probability, volume, and time-to-resolution together, and added minimum notional thresholds ($5K for HIGH, $10K for CRITICAL) to filter noise.
What we learned
- Dual-mode architecture pays off — being able to iterate locally with SQLite and deploy to DynamoDB with zero changes made development dramatically faster
- DynamoDB Streams are powerful — automatic event-driven WebSocket broadcasting with no polling or queuing infrastructure
- AI enrichment vs. AI dependency — Claude 3 Haiku adds narrative context but the detection algorithms stand alone; this separation keeps the system reliable
- Prediction market surveillance is an open problem — there's very little existing tooling for this; the regulatory frameworks are still being written
- SAM + Step Functions make serverless orchestration manageable — the pipeline runs every 2 hours with no servers to maintain
What's next
- Live market monitoring — currently processes settled markets; extending to open markets with streaming trade data
- Multi-user pipelines — per-user detection configurations and custom alert thresholds
- Expanded detection algorithms — wash trading detection, account clustering, cross-market correlation
- Regulatory reporting — exportable case files with full evidence chains for CFTC-style submissions
- Mobile alerts — push notifications via SNS when CRITICAL anomalies are detected
AI Tools Used
This project was built with assistance from AI coding tools:
- Claude (Anthropic) — Primary development partner via Claude Code CLI for architecture design, full-stack implementation, debugging, and this writeup
- OpenAI Codex — Code generation and iteration assistance
- Kiro (AWS) — AI-powered IDE for AWS infrastructure development
- Claude 3 Haiku (AWS Bedrock) — Powers the in-app AI anomaly analysis and market explanation features
Built with
Python, TypeScript, React, AWS Lambda, DynamoDB, API Gateway, Step Functions, EventBridge, AWS Bedrock (Claude 3 Haiku), Cognito, Amplify, S3, CloudWatch, X-Ray, SNS, Tailwind CSS, Recharts, Framer Motion, Vite, SAM