Inspiration
Most retail traders lose money because they chase confirmation bias — they find one bullish signal and ignore everything else. Professional trading desks avoid this
with structured debate: analysts pitch, risk managers challenge, and backtests ground-truth the thesis. We wanted to bring that same adversarial rigor to individual
investors using multi-agent AI.
What it does
Polybot runs six specialized AI agents in a cyclic pipeline to analyze any stock ticker. A Market Context agent pulls real price data and technicals, a Sentiment
agent classifies news headlines, an Alpha Generator proposes a trade, a Devil's Advocate tears it apart, a Backtest Validator checks if similar setups have
historically worked, and a Risk Manager sizes the position using Kelly criterion math. If the proposal is weak, the pipeline loops back for revisions — up to twice
before forcing a HOLD. Every agent's reasoning streams to a split-pane UI in real time so you can watch the debate unfold. Approved trades execute as bracket orders
on Alpaca's paper trading platform.
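The revision loop described above can be sketched as a few lines of Python. This is an illustrative simplification, not Polybot's actual LangGraph code; the function names and the `"APPROVE"`/`"HOLD"` strings are assumptions:

```python
# Hypothetical sketch of Polybot's revision loop: each cycle runs the
# debate, and after MAX_REVISIONS failed cycles the pipeline forces a HOLD.
MAX_REVISIONS = 2

def run_pipeline(evaluate_proposal, revise_proposal, initial_proposal):
    """Run the propose -> challenge -> revise cycle with a hard cap."""
    proposal = initial_proposal
    for attempt in range(MAX_REVISIONS + 1):
        verdict = evaluate_proposal(proposal)  # Devil's Advocate + Risk Manager
        if verdict == "APPROVE":
            return "EXECUTE", proposal
        if attempt < MAX_REVISIONS:
            # Loop back: the Alpha Generator revises its trade idea
            proposal = revise_proposal(proposal, verdict)
    # Revisions exhausted without approval: do nothing rather than force a trade
    return "HOLD", proposal
```

The hard cap is what keeps the adversarial design practical: the debate either converges within two revisions or defaults to the safest action.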
How we built it
The backend is a FastAPI server orchestrating agents through LangGraph with cyclic conditional edges. Google Gemini (via Vertex AI) powers the reasoning agents with
structured output schemas so responses are typed, not parsed from free text. Market data and execution flow through Alpaca's API. The backtest engine uses k-means
clustering and KNN pattern matching over historical bars replayed through vectorbt. The frontend is a Next.js app with React Flow for the agent graph visualization,
WebSocket streaming for live token output, and a split-pane "Cross-Examination Terminal" showing the Alpha vs. Adversarial debate side by side.
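The KNN pattern-matching idea behind the backtest engine can be sketched in plain Python. This is a minimal illustration of the nearest-neighbour step only; the window shape and function names are assumptions, and the real engine additionally clusters with k-means and replays matches through vectorbt:

```python
import math

def knn_similar_setups(history, query, k=5):
    """Find the k historical return-windows most similar to the current one.

    `history` is a list of (window, forward_return) pairs, where each window
    is a fixed-length sequence of normalized returns and forward_return is
    what the price did afterwards; `query` is the current window.
    """
    def dist(a, b):
        # Euclidean distance between two return-windows
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    ranked = sorted(history, key=lambda pair: dist(pair[0], query))
    matches = ranked[:k]
    # The average forward return of the nearest neighbours estimates
    # whether similar setups have historically worked.
    avg_forward = sum(r for _, r in matches) / len(matches)
    return matches, avg_forward
```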
Challenges we ran into
Getting the Adversarial agent calibrated was the hardest part. Too aggressive and it vetoes everything — too lenient and it becomes a rubber stamp. We had to tune the
system prompt, add deterministic escalation rules (e.g., earnings within the hold window automatically triggers a CRITICAL flag), and cap the revision loop to
prevent infinite debates. Backtest validity was another challenge: ensuring strict point-in-time data filtering so agents can't accidentally peek at future prices
during evaluation, and handling tickers with insufficient historical matches gracefully.
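The point-in-time guard amounts to one strict filter plus a graceful fallback. A minimal sketch, assuming bars carry a comparable `timestamp` field and an illustrative minimum-history threshold (not Polybot's actual schema or cutoff):

```python
def point_in_time_bars(bars, as_of, min_bars=30):
    """Return only bars already known at `as_of`, so an agent being
    evaluated can never peek at future prices.

    `bars` is a list of dicts with a comparable "timestamp" key; the
    30-bar minimum is an illustrative assumption.
    """
    visible = [b for b in bars if b["timestamp"] <= as_of]
    # Gracefully signal "not enough history" instead of trading on thin data
    return visible if len(visible) >= min_bars else None
```

Returning `None` instead of raising lets the pipeline treat a thin-history ticker the same way it treats a failed debate: no trade.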
Accomplishments that we're proud of
In our evaluation framework, Polybot's approved BUY signals averaged +2.45% excess return over the S&P 500, and the Risk Manager correctly rejected 75% of proposals that didn't meet its criteria, evidence that the adversarial architecture works as intended. The system rejected trades for concrete, auditable reasons (a stop too tight relative to ATR, insufficient backtest matches, a negative Kelly fraction) rather than vibes. The full agent reasoning is transparent and streamable, so you never have to trust a black box.
What we learned
Structured disagreement produces better decisions than consensus-seeking. Forcing a dedicated adversarial agent into the pipeline caught edge cases that a single
monolithic LLM would have confidently missed. We also learned that position sizing math (Kelly criterion, portfolio heat caps) matters more than signal accuracy — a
great signal with bad sizing still loses money. Finally, building evaluation infrastructure early (our run_eval backtester) was essential for iterating on agent
prompts with real feedback instead of guessing.
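The sizing math above can be made concrete with the textbook Kelly fraction, f* = p - (1 - p)/b, where p is the win probability and b the average win/loss ratio. A hedged sketch; the half-Kelly scaling and 2% heat cap are illustrative assumptions, not Polybot's actual parameters:

```python
def kelly_position_fraction(win_prob, win_loss_ratio, heat_cap=0.02):
    """Fractional-Kelly position size with a portfolio heat cap.

    Returns None when the Kelly fraction is negative (no edge: reject
    the trade), otherwise half-Kelly capped at `heat_cap` of equity.
    """
    f_star = win_prob - (1.0 - win_prob) / win_loss_ratio
    if f_star <= 0:
        return None  # negative Kelly fraction: the trade loses in expectation
    return min(f_star * 0.5, heat_cap)  # half-Kelly, capped by portfolio heat
```

Plugging in a 60% win rate with 2:1 payoffs gives f* = 0.4, which the half-Kelly scaling and heat cap shrink to 2% of equity; a 40% win rate at 1:1 gives a negative f* and an automatic rejection, which is exactly the "great signal, bad sizing" failure mode the lesson is about.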
What's next for Polybot
Adding a live earnings calendar integration via Finnhub so the Adversarial agent can flag upcoming catalysts automatically. Expanding the backtest engine with more
similarity algorithms and a Parquet cache for faster historical lookups. Building a watchlist mode that runs the pipeline nightly across a user-defined set of tickers
and surfaces only high-conviction opportunities. And eventually, once we've built enough confidence from paper trading results, exploring cautious live execution
with strict position limits.
Built With
- fastapi
- langgraph
- python
- react
- typescript