Inspiration

Quantitative finance is no longer just about linear statistics; it has evolved into a high-stakes, data-driven domain where speed, security, and complex machine learning decide profitability. However, building, backtesting, and validating a quantitative strategy requires writing hundreds of lines of boilerplate code, managing shifting yfinance data structures, and manually running risk filters.

We built APEX Quant-Forge to bridge the gap between human intuition and institutional-grade execution. By orchestrating a collaborative, multi-agent network, we wanted to empower traders to speak their trading ideas in plain English, and watch as an autonomous network of specialized AI agents writes, validates, self-heals, refines, and tests those strategies against real market data—all in a secure, high-performance sandbox.


What it does

APEX Quant-Forge is an advanced, multi-agent algorithmic backtesting platform. The system operates through a structured state-machine pipeline:

  1. Intent Analysis & Parameter Extraction: Parses natural language prompts to extract tickers, strategy rules, indicators, and date ranges.
  2. AST Code Sanitation: Runs generated Python strategy scripts through an Abstract Syntax Tree (AST) validator to block unsafe commands (such as sys, os, or socket operations) before execution.
  3. Isolated Sandbox Execution: Runs the code in an isolated subprocess, fetching data, running vectorised backtests, and outputting trade metrics.
  4. Self-Healing Critic: If a script fails inside the sandbox, a recursive Critic node parses the traceback, identifies syntax/indentation errors, and auto-patches the code.
  5. Machine Learning Risk Filter: Dynamically trains a local Random Forest Classifier on technical features (RSI, MACD, Volatility) to classify trades, blocking high-risk entries as ML_FILTERED.
  6. Parameter Optimization: Evaluates holding periods and indicator windows, automatically rejecting underperforming parameters to maximize Sharpe Ratio.

How we built it

We engineered APEX Quant-Forge using a state-of-the-art developer stack optimized for performance and resilience:

  • Orchestration: Built on a state-driven agent network using LangGraph and LangChain.
  • Frontend: Responsive, dark-themed dashboard styled using custom CSS inside Streamlit.
  • Resilient Routing Proxy: Built a multi-key, 5-provider router supporting Groq, Gemini, Cerebras, OpenRouter, and Mistral with automatic fallback rotation to combat rate limits (429 errors).
  • Database: Persistent local trade tracking and run histories managed via SQLite.
  • ML Core: Implemented with Scikit-Learn for the Random Forest Classifier.
  • Math & Charts: Built with Pandas, NumPy, and interactive Plotly timelines.
  • Deployment: Containerized using a custom Dockerfile and deployed to Hugging Face Spaces.

Quantitative Formulations Used

We injected dynamic mathematical calculations into the sandbox output:

  • Sharpe Ratio measures risk-adjusted return: $$ Sharpe = \frac{\overline{R}_p - R_f}{\sigma_p} $$ Where \(\overline{R}_p\) is the mean asset return, \(R_f\) is the risk-free rate, and \(\sigma_p\) is the standard deviation of returns.

  • Sortino Ratio evaluates downside risk specifically: $$ Sortino = \frac{\overline{R}p - R_f}{\sigma_d} $$ Where \(\sigma_d\) is the downside deviation of negative returns: $$ \sigma_d = \sqrt{\frac{1}{N} \sum{i=1}^{N} \min(0, R_i - R_f)^2} $$


Challenges we ran into

  • Subprocess Startup Latency: Spawning python subprocesses for sandbox code execution originally took up to 25.5 seconds due to library import overhead (pandas, numpy, sklearn). We solved this by pre-warming a daemon process in memory, achieving a 25x acceleration in backtest turnaround.
  • API Rate Limiting (429s): High rate limits during testing would crash the workflow. We engineered a 12-key, 5-provider failover routing proxy that seamlessly rotates API keys when rate limits or timeouts are detected.
  • Structural Data Shift: Structural changes in the yfinance data format returned Multi-Indexed DataFrames that broke traditional slicing. We added a programmatic column-flattening step directly inside our Coder Agent instructions.
  • Indentation Errors in LLM Code: Code injected dynamically with mathematical stats frequently broke due to whitespace errors. We solved this by writing an AST parser and a rfind() code injector that safely finds the main block indentation level before appending metrics calculations.

Accomplishments that we're proud of

  • Robust Self-Healing Pipeline: Seeing the Critic agent capture a sandbox traceback, identify the exact broken line, patch the code, and successfully rerun the backtest without any user intervention.
  • Machine Learning Integration: Successfully integrating a local Random Forest classifier that dynamically gates trades in real-time, proving that AI filters can protect portfolio drawdown.
  • Aesthetic Excellence: Crafting a dark, cohesive dashboard theme that makes terminal streaming logs, ML feature importances, and interactive Plotly curves look institutional-grade.
  • Zero-Configuration Demo Mode: Implementing a relaxed API guard that lets users try the app instantly via a cached mock fallback system if no API keys are provided.

What we learned

  • AST Parsing is Crucial: Relying purely on LLM syntax instructions is not enough for sandbox safety. Enforcing strict AST validation is the only way to secure code execution.
  • Orchestration Trumps Size: Smaller, specialized models (like Llama 3.1 8B on Cerebras or Gemini 2.5 Flash) organized in a cooperative network (LangGraph) outperform single large models trying to do everything at once.
  • Pre-Warming Boosts UX: Users expect instant feedback. Pre-loading heavy data libraries in background memory turns a sluggish application into a premium, responsive experience.

What's next for APEX Quant-Forge AI Agent

  • Live Broker Integrations: Connecting to paper-trading APIs (like Alpaca or Interactive Brokers) to execute generated strategies in real-time.
  • Multi-Asset Portfolio Rebalancing: Moving beyond single-ticker testing to allow the agent to manage and rebalance a diversified index portfolio.
  • Higher Frequency Options & Crypto Connectors: Integrating WebSocket streams to capture sub-second options pricing data and high-volatility crypto markets.
  • Mobile Webhook Alerts: Building automated SMS/Telegram alert webhooks to notify users when the ML filter blocks a trade or when a strategy triggers exit conditions.

Built With

Share this project:

Updates