Crypto Hedge Fund — Autonomous Multi-Agent Trading Platform

A hackathon project: a fully autonomous, multi-agent crypto trading fund in which Claude plays the research-analyst and decision-maker roles of a real hedge fund's org chart, running live, 24/7, on real market data, in paper-trading mode. We chose crypto because the competition fell on a weekend, when most other financial markets were closed. We plan to add more asset classes for diversification during the coming trading week.

1. The Pitch

We built a hedge fund that runs itself. Instead of one model "talking about" trading, we modeled a real fund's org chart as a pipeline of specialized Claude agents — a Sentiment Analyst, an On-Chain Analyst, a CIO, a Portfolio Manager, a Risk Manager, a Compliance Officer — each with a narrow mandate, its own system prompt, and no authority outside its lane. They hand off structured JSON, not prose, to each other every 5 minutes, 24/7, against live Coinbase/OKX market data. Hard risk limits (drawdown halt, daily-loss halt, exposure caps, kill switch) are enforced in deterministic Python code, never by an LLM — the agents can recommend, but they cannot override a hard stop. It's currently live, paper-trading 10 crypto assets, with a real-time dashboard showing every signal, decision, and trade as it happens.

2. The Problem We're Solving

LLM trading demos are usually a single prompt: "here's some data, what should I buy?" That's a toy. A real fund has:

Specialized roles — a macro strategist doesn't size positions, a risk officer doesn't generate ideas, a compliance officer doesn't take market views.
Hard, non-negotiable limits — no fund lets a model freelance past a drawdown limit because it "felt confident."
Auditability — every dollar moved has to trace back to why.

We wanted to know: can you actually architect a system of LLM agents that respects those constraints — where the boundaries of authority are as important as the intelligence — and have it run unattended, continuously, against real data?

3. System Architecture

3.1 The Agent Pipeline (runs every 5 minutes, fully automated)

┌─────────────────────────── PHASE 1: INGESTION (parallel) ───────────────────────────┐
│  Market Data (Coinbase)   │  News Sentiment (NewsAPI)   │  On-Chain (OKX)            │
└───────────────────────────┴──────────────────────────────┴────────────────────────────┘
                                        │
┌─────────────────────────── PHASE 2: RESEARCH (parallel) ─────────────────────────────┐
│  Momentum Analyst          │  Sentiment Analyst (Claude) │  On-Chain Analyst (Claude) │
│  (deterministic: RSI,      │  reads headlines, scores    │  reads funding rates/OI/   │
│  MACD, Bollinger Bands)    │  conviction per asset       │  liquidations              │
└───────────────────────────┴──────────────────────────────┴────────────────────────────┘
                                        │
                              SignalBatch (typed JSON)
                                        │
┌─────────────────────────── PHASE 3: DECISION CHAIN (sequential) ─────────────────────┐
│  CIO (Claude)         →  Portfolio Manager  →  Risk Manager   →  Compliance          │
│  reads ALL signals,      (Claude) proposes     (deterministic,   (deterministic,     │
│  sets market regime &    trades, sized by      code-enforced     code-enforced rule  │
│  posture multiplier      CIO's posture         hard limits)      book checks)        │
└───────────────────────────────────────────────────────────────────────────────────────┘
                                        │
                              Execution → Paper Fill → Portfolio State Update
                                        │
                         Broadcast to live dashboard (WebSocket)

The key design decision: research and ideation are LLM-driven; capital preservation is not. A model can suggest a trade. It cannot approve its own trade, size it without a formula, or override a drawdown halt — those are plain Python.
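
A minimal sketch of one tick, under stated assumptions: the function names (`ingest`, `research`, `cio`, `portfolio_manager`, `risk_manager`, `compliance`, `run_tick`), the `Signal` fields, and the stubbed return values are all hypothetical and not taken from the project's code. It only illustrates the shape described above: two parallel phases joined with `asyncio.gather`, then a sequential decision chain in which the deterministic stages have the last word.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    analyst: str       # e.g. "momentum", "sentiment", "onchain"
    asset: str         # e.g. "BTC"
    score: float       # -1.0 (bearish) .. +1.0 (bullish); this scale is an assumption
    confidence: float  # 0.0 .. 1.0


@dataclass
class SignalBatch:
    signals: list[Signal] = field(default_factory=list)


async def ingest(source: str) -> dict:
    # Stand-in for a Coinbase / NewsAPI / OKX fetch.
    await asyncio.sleep(0)
    return {"source": source}


async def research(analyst: str, raw: dict) -> list[Signal]:
    # Stand-in for an analyst (deterministic or Claude-backed).
    await asyncio.sleep(0)
    return [Signal(analyst, "BTC", 0.2, 0.6)]


def cio(batch: SignalBatch) -> dict:
    # Stub for the Claude CIO call: a regime plus a posture multiplier.
    return {"regime": "RANGING", "posture": 0.5}


def portfolio_manager(batch: SignalBatch, view: dict) -> list[dict]:
    # Stub for the Claude PM call: proposals are scaled by the CIO's posture.
    return [
        {"asset": s.asset, "side": "buy", "usd": 1000.0 * view["posture"]}
        for s in batch.signals
        if s.score > 0
    ]


def risk_manager(trades: list[dict], kill_switch: bool) -> list[dict]:
    # Plain Python gate: no model output can change this result.
    return [] if kill_switch else trades


def compliance(trades: list[dict]) -> list[dict]:
    # Plain Python rule-book check (reduced here to a single sanity rule).
    return [t for t in trades if t["usd"] > 0]


async def run_tick(kill_switch: bool = False) -> list[dict]:
    # Phase 1: ingestion, in parallel.
    raw = await asyncio.gather(ingest("market"), ingest("news"), ingest("onchain"))
    # Phase 2: research, in parallel.
    results = await asyncio.gather(
        research("momentum", raw[0]),
        research("sentiment", raw[1]),
        research("onchain", raw[2]),
    )
    batch = SignalBatch([s for r in results for s in r])
    # Phase 3: decision chain, strictly sequential.
    view = cio(batch)
    proposed = portfolio_manager(batch, view)
    return compliance(risk_manager(proposed, kill_switch))
```

Calling `asyncio.run(run_tick())` returns the trades that survived both deterministic gates; with `kill_switch=True` the same call returns an empty list regardless of what the stubbed models proposed.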

3.2 Hard Risk Rules (code, not prompts)

Rule	Trigger	Action
RISK_001	Drawdown from session high > 5%	Kill switch, force-close all positions
RISK_002	Daily loss > 3%	Halt new positions for the rest of the day (sticky, survives P&L recovery)
RISK_003	Single position loss > 2% of portfolio	Block adding to the loser
RISK_004	Single-asset allocation > 20%	Resize down to the cap
RISK_005	Total long exposure > 80%	Reject additional longs
RISK_006	Kill switch active	Reject everything

Plus a separate 6-point compliance rule book (macro-event blackout windows, exchange status checks, minimum liquidity, duplicate-order prevention, paper-trading confirmation, market-impact caps) — all deterministic, zero LLM involvement.
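
The risk table in 3.2 could be expressed as a single deterministic function, sketched below. The thresholds come from the table; everything else is an assumption made for illustration: the `PortfolioState` and `Proposal` types, the name `check_proposal`, the evaluation order of the rules, and the reading of RISK_005 as "reject a long that would push exposure past 80%". Force-closing positions after RISK_001 is left out.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_DRAWDOWN = 0.05       # RISK_001
MAX_DAILY_LOSS = 0.03     # RISK_002
MAX_POSITION_LOSS = 0.02  # RISK_003
MAX_ASSET_ALLOC = 0.20    # RISK_004
MAX_LONG_EXPOSURE = 0.80  # RISK_005


@dataclass
class PortfolioState:
    equity: float             # current portfolio value, USD
    session_high: float       # highest equity seen this session
    day_start_equity: float   # equity at the start of the trading day
    long_exposure: float      # total USD currently in long positions
    kill_switch: bool = False
    daily_halt: bool = False  # sticky once set (RISK_002)


@dataclass
class Proposal:
    asset: str
    usd: float                     # size of the proposed long, USD
    current_asset_usd: float       # existing allocation to this asset
    position_pnl_usd: float = 0.0  # unrealized P&L on the existing position


def check_proposal(state: PortfolioState, p: Proposal) -> tuple[str, float, str | None]:
    """Return (decision, approved_usd, rule_that_fired_or_None)."""
    if state.kill_switch:
        return ("reject", 0.0, "RISK_006")
    if (state.session_high - state.equity) / state.session_high > MAX_DRAWDOWN:
        state.kill_switch = True  # force-close is handled elsewhere
        return ("reject", 0.0, "RISK_001")
    daily_loss = (state.day_start_equity - state.equity) / state.day_start_equity
    if state.daily_halt or daily_loss > MAX_DAILY_LOSS:
        state.daily_halt = True   # stays set even if P&L later recovers
        return ("reject", 0.0, "RISK_002")
    if p.position_pnl_usd < -MAX_POSITION_LOSS * state.equity:
        return ("reject", 0.0, "RISK_003")
    usd, rule = p.usd, None
    room = MAX_ASSET_ALLOC * state.equity - p.current_asset_usd
    if usd > room:
        usd, rule = max(room, 0.0), "RISK_004"  # resize down to the cap
    if usd <= 0.0:
        return ("reject", 0.0, rule)
    if state.long_exposure + usd > MAX_LONG_EXPOSURE * state.equity:
        return ("reject", 0.0, "RISK_005")
    return ("resize" if rule else "approve", usd, rule)
```

For example, on a $100,000 portfolio with no positions, a $30,000 BTC proposal comes back as a resize to $20,000 under RISK_004, while any proposal made after a 6% drawdown is rejected under RISK_001 and every later one under RISK_006.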

3.3 Tech Stack

Layer	Technology
Agents / LLM	Claude Sonnet 4.6, Anthropic Python SDK
Backend	Python 3.12, FastAPI, asyncio
Scheduling	APScheduler (5-min tick loop)
Data	TimescaleDB (OHLCV time-series), Redis (live state, signal cache)
Market data	`ccxt` → Coinbase (public, unauthenticated)
On-chain data	OKX public API (funding rate, open interest, liquidations)
News sentiment	NewsAPI.org
Frontend	React 18 + TypeScript + Vite, Tailwind, Recharts, Zustand, React Query
Live updates	WebSocket push from backend → dashboard
Testing	pytest, 232 tests, all passing

4. What Makes This Non-Trivial (the engineering, not just the prompt)

A few things that bit us — and the fixes — are worth highlighting to judges, because they show this is a system, not a demo wrapper:

LLMs don't reliably follow "JSON only." Every analyst occasionally wraps its output in markdown fences, or adds explanatory prose before/after the JSON, despite being told not to. We built a robust extractor (strip_json_fences) that scans for the actual JSON value anywhere in the response and discards the rest — applied uniformly across all four LLM-output parsing sites.
A naive "fallback to safe defaults on any parse failure" silently breaks the system. Early on, the CIO agent was always defaulting to its most conservative regime — not because the market was bad, but because minor formatting hiccups kept triggering full fallback, discarding the model's real (and often good) judgment. We rebuilt the recovery path to normalize field-name variations and salvage the model's actual reasoning instead of nuking it.
Paper-trading accounting bugs are real bugs. Our first implementation tracked positions by USD cost basis only — selling a position settled at the entry price, not the market price, silently destroying every realized gain or loss. We caught this with a deliberate buy → price-move → sell regression test and rebuilt the engine around base-asset quantity tracking (the correct way to model a position).
Free data sources don't replace paid ones cleanly. Our original on-chain data provider (CoinGlass) turned out to gate funding rate and open interest behind a paid plan we didn't have: it returned HTTP 200 with an "upgrade your plan" message in place of data. We migrated to OKX's free public market-data API and verified real funding rate / open interest / liquidation data end-to-end.
A safety-critical check that vanishes under -O. The "never place a live order" gate was originally a bare assert — Python silently strips those when run with optimizations enabled. Replaced with explicit, non-strippable exceptions.
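
The extractor from the first item in this list could look roughly like the sketch below. Only the name `strip_json_fences` comes from the project; the body is an assumption about one reasonable way to do it, using the standard library's `json.JSONDecoder.raw_decode` to parse a JSON value starting at a given offset and ignore whatever follows it.

```python
import json
from typing import Any

_DECODER = json.JSONDecoder()


def strip_json_fences(text: str) -> Any:
    """Return the first JSON object or array found anywhere in `text`.

    Markdown fences and prose before or after the JSON are ignored.
    Raises ValueError if no parseable object or array is present.
    """
    for i, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at index i and
            # returns (value, end_index); trailing text is not an error.
            value, _end = _DECODER.raw_decode(text, i)
        except json.JSONDecodeError:
            continue  # this brace was not the start of valid JSON; keep scanning
        return value
    raise ValueError("no JSON object or array found in model output")
```

Raising on total failure, rather than returning a default, keeps the decision about fallback behavior with the caller, which matters given the second item in the list.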

We treat this list as a feature, not a confession: a hackathon project that ships with a real bug backlog and fixes is more credible than one that claims it "just worked."
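
The accounting fix from the list above (tracking base-asset quantity rather than USD cost basis alone) can be illustrated with a small sketch. The `PaperAccount` class and its method names are hypothetical, and fees and slippage are ignored; the point is only that a sell settles at the current market price, so realized P&L is proceeds minus the share of cost basis being sold.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PaperAccount:
    cash: float
    qty: dict[str, float] = field(default_factory=dict)         # base-asset units held
    cost_basis: dict[str, float] = field(default_factory=dict)  # USD paid for those units
    realized_pnl: float = 0.0

    def buy(self, asset: str, usd: float, price: float) -> None:
        if usd <= 0 or usd > self.cash:
            raise ValueError("invalid or unaffordable buy size")
        self.cash -= usd
        self.qty[asset] = self.qty.get(asset, 0.0) + usd / price
        self.cost_basis[asset] = self.cost_basis.get(asset, 0.0) + usd

    def sell(self, asset: str, quantity: float, price: float) -> float:
        held = self.qty.get(asset, 0.0)
        if quantity <= 0 or quantity > held:
            raise ValueError("cannot sell more than is held")
        basis_sold = self.cost_basis[asset] * (quantity / held)
        proceeds = quantity * price  # settle at the market price, not the entry price
        pnl = proceeds - basis_sold
        self.cash += proceeds
        self.realized_pnl += pnl
        self.qty[asset] = held - quantity
        self.cost_basis[asset] -= basis_sold
        return pnl
```

A buy of $1,000 of BTC at $50,000 followed by a sale of the full 0.02 BTC at $55,000 should realize about $100, which mirrors the buy → price-move → sell regression test described above.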

5. Live Demo — Real Numbers (as of this build)

232 automated tests, all passing, covering every agent boundary and risk rule
58 ticks completed autonomously today, 155 real signals generated, zero manual intervention
10 crypto assets tracked end-to-end: BTC, ETH, SOL, BNB, XRP, ADA, AVAX, DOT, POL, LINK
Current regime: RANGING / posture NEUTRAL. The CIO has stayed conservative because signals haven't yet corroborated across analysts (the system is designed to wait for agreement rather than force a trade)
Full audit trail: every signal, regime call, proposed trade, risk decision, compliance check, and fill is logged with a complete provenance chain

Dashboard Pages

Overview — live portfolio value, P&L, drawdown, kill-switch button (with a type-to-confirm modal), agent health strip
Signal Feed — live stream of every signal as it's generated, filterable by asset/analyst/confidence
Trade Log — every trade's full lifecycle (proposed → approved/rejected → filled) with expandable provenance
Backtesting — replay the full agent pipeline against historical data, with Sharpe/Sortino/drawdown/win-rate metrics and equity curve charts
Agent Monitor — per-agent latency, error counts, live activity log, current CIO regime
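
For reference, the backtest metrics named above can be computed as in the following sketch. These are the textbook definitions with a zero risk-free rate, not necessarily the exact formulas the dashboard uses; the function names are hypothetical, and returning 0.0 when a ratio's denominator is zero is a convention chosen for this sketch. With one return per 5-minute tick on a 24/7 market, there are 12 × 24 × 365 = 105,120 periods per year.

```python
import math
from statistics import mean, stdev

TICKS_PER_YEAR = 12 * 24 * 365  # 5-minute ticks, 24/7 market


def max_drawdown(equity: list[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe(returns: list[float], periods_per_year: int = TICKS_PER_YEAR) -> float:
    """Annualized mean return over sample standard deviation (needs >= 2 returns)."""
    sd = stdev(returns)
    return 0.0 if sd == 0 else mean(returns) / sd * math.sqrt(periods_per_year)


def sortino(returns: list[float], periods_per_year: int = TICKS_PER_YEAR) -> float:
    """Like Sharpe, but only negative returns count toward the deviation."""
    downside = math.sqrt(mean([min(r, 0.0) ** 2 for r in returns]))
    return 0.0 if downside == 0 else mean(returns) / downside * math.sqrt(periods_per_year)


def win_rate(trade_pnls: list[float]) -> float:
    """Fraction of closed trades with strictly positive P&L."""
    return sum(1 for p in trade_pnls if p > 0) / len(trade_pnls)
```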

6. Safety Posture

Paper trading only, enforced in code at two separate layers (execution agent + paper engine), not just configuration — and the check can't be silently disabled.
No code path anywhere in the system is capable of submitting a real order; market data fetches are unauthenticated/read-only even where API keys exist.
Kill switch is checkable three ways (file flag, Redis flag, API call) and once triggered cannot be cleared without a manual, deliberate reset.
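
A sketch of how such guards might look in code, with hypothetical names throughout (`LiveTradingForbidden`, `KillSwitchActive`, `require_paper_mode`, `kill_switch_active`, `submit_paper_order`). It shows the two ideas from this section: the paper-mode check is an explicit `raise` rather than an `assert` (which Python removes under `-O`), and the kill switch is treated as active if any one of its sources says so. Only the file flag is implemented directly; the Redis and API checks are represented as caller-supplied callables.

```python
from pathlib import Path
from typing import Callable, Iterable


class LiveTradingForbidden(RuntimeError):
    """Raised when anything other than paper mode is requested."""


class KillSwitchActive(RuntimeError):
    """Raised when an order is attempted while the kill switch is on."""


def require_paper_mode(mode: str) -> None:
    # `assert mode == "paper"` would be stripped by `python -O`;
    # an explicit raise always runs.
    if mode != "paper":
        raise LiveTradingForbidden(f"refusing to trade in mode {mode!r}")


def kill_switch_active(
    flag_file: Path,
    extra_checks: Iterable[Callable[[], bool]] = (),
) -> bool:
    # Any single source reporting "on" is enough to halt trading.
    return flag_file.exists() or any(check() for check in extra_checks)


def submit_paper_order(
    order: dict,
    mode: str,
    flag_file: Path,
    extra_checks: Iterable[Callable[[], bool]] = (),
) -> dict:
    require_paper_mode(mode)
    if kill_switch_active(flag_file, extra_checks):
        raise KillSwitchActive("kill switch is on; order rejected")
    return {**order, "status": "paper_filled"}
```

Because the flag file is only ever read here, clearing the kill switch stays a deliberate manual step outside this code path.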

7. What We'd Build Next

Persist full signal/trade provenance to TimescaleDB (currently Redis/WebSocket only — survives a restart, but isn't queryable historically yet)
Unify the backtest and live execution code paths so they can never silently diverge
A second on-chain data source for redundancy now that we've proven the pattern with OKX
Real (not paper) execution behind an explicit, separately-gated feature flag — by design, currently impossible without a deliberate code change
Add more asset classes to diversify away from crypto now that markets are reopening for trading

Built With

claude
docker
python
react
redis
timescaledb
typescript

Updates

Alan Talavera started this project — Jun 21, 2026 01:18 PM EDT
