NewsQuantX — Agentic Macro AI on TiDB

Elevator pitch
When CPI, NFP, or a Fed statement drops, markets move in seconds, but most traders only see raw numbers or 40-page PDFs. By the time they've read and reacted, price has often already moved. We built NewsQuantX to fix that: an agentic macro AI that ingests raw news, finds historical analogs, merges positioning data, and instantly returns concise, confidence-scored trading signals and analyst-style narratives.


Inspiration

Markets move in seconds when major economic reports drop (CPI, NFP, Fed statements), yet most traders only see raw numbers or 40-page PDFs. By the time they digest the information, price has already moved. Professional desks rely on expensive terminals with built-in analytics, while retail traders are left behind.

We asked ourselves: What if anyone could instantly understand what the Fed just said, or how a CPI miss compares to the last 10 years? That idea became NewsQuantX: an agentic AI macro analyst that takes raw news, historical analogs, and positioning data, and delivers concise, confidence-scored signals that traders can actually use.

Along the way we discovered another hard truth: clean, consistent macro datasets and high-quality historical documents are surprisingly painful to assemble. That data work became a core part of the product and a competitive advantage for the demo.


What it does

NewsQuantX turns complicated macro data into trader-ready answers and signals:

  • AI Chat Assistant with Tool-Calling: Ask natural queries like “last 5 CPI misses” or “what did Powell say last press conference?” and the agent chooses the right tool (vector search, SQL lookup, signal generator) and streams a structured response.
  • Event Clustering Awareness: Groups intra-day events by country and type, selects a primary event, and analyzes the whole cluster together (with actual/forecast/previous values, surprises, and revisions).
  • Event Detail: See actual/forecast/previous, a short LLM analyst narrative, top vector analogs (KNN), and median price reactions (1h / 4h / 1d).
  • Embedded Documents: We scrape and embed FOMC statements, minutes, projections (SEP), press releases, and central bank communications. These are vector-stored in TiDB and searchable in chat or linked to relevant events.
  • Signals: Each event produces a per-asset score, confidence, and JSON drivers (why the signal fired), persisted in TiDB for reuse.
  • Market Regime Detection: COT (Commitments of Traders) positioning and FRED macro series (CPI, GDP, labor, recession indicators, VIX) are collected and stored in TiDB, feeding into the regime engine that classifies tightening vs easing, expansion vs contraction.
  • Dashboards: Rates, yield curves, inflation, labor, and positioning, plus COT dashboards (managed vs commercial cohorts) and Fed SEP dot plots with historical deltas.
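As a concrete sketch of the persisted signal described above, here is a minimal Python example of what one signal object could look like. All field names and values are illustrative assumptions, not the project's exact schema:

```python
# Hypothetical shape of one persisted, explainable signal (illustrative fields).
import json

signal = {
    "event_slug": "us-cpi-2024-06",   # which macro print fired the signal
    "asset": "EURUSD",
    "score": -0.42,                   # negative = bearish lean for the asset
    "confidence": 0.71,               # 0..1, based on agreement across tools
    "drivers": {                      # the "why it fired" JSON payload
        "surprise": "CPI +0.2pp above forecast",
        "analog_median_1h": -0.15,    # median 1h reaction of vector analogs
        "cot_z": 1.8,                 # positioning z-score from COT data
        "regime": "tightening",       # output of the regime engine
    },
}

# Serialized to JSON so bots, backtests, and dashboards can all reuse it.
row = json.dumps(signal)
```

Keeping the drivers as structured JSON, rather than freeform text, is what makes the signal reusable downstream.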

How we built it

  • TiDB Cloud (HTAP + Vector): a single store for events, doc chunks, embeddings, intraday prices, and persistent signals, enabling fast hybrid vector + SQL queries.
  • FastAPI + Uvicorn backend: the agent exposes callable tools (get_event_signals, search_docs, get_event_analogs, list_events_by_slug), streams results, and ends with structured @@BLOCKS@@ JSON so the UI can render deterministic components.
  • Next.js + Tailwind + shadcn/ui: streaming chat UI, event pages, and dashboards.
  • OpenAI models: text-embedding-3-small for vectors and gpt-4o-mini for streaming summarization and analysis.
  • Orchestrator: combines quantitative features (actual/forecast/previous beat/miss, surprise metrics, COT z-scores, FRED regime) with qualitative doc context into a single, explainable signal object.
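The hybrid "vector KNN + SQL fast-path" pattern above can be sketched as a single query. The table and column names below are illustrative assumptions; the `VEC_COSINE_DISTANCE` function is TiDB's vector distance function:

```python
# Sketch of a hybrid query: SQL predicates narrow the candidate set,
# then TiDB's vector distance function ranks the nearest analogs.
# Table/column names are hypothetical, not the project's exact schema.
def build_analog_query(table: str = "event_chunks", k: int = 5) -> str:
    return f"""
        SELECT id, event_slug, chunk_text,
               VEC_COSINE_DISTANCE(embedding, %s) AS dist
        FROM {table}
        WHERE country = %s AND event_type = %s  -- SQL fast-path filter
        ORDER BY dist                           -- vector KNN ranking
        LIMIT {k}
    """

sql = build_analog_query()
```

Filtering in SQL before the KNN ordering keeps the distance computation confined to relevant rows, which is what makes the hybrid query fast.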

Crucial detail: the agent is extensible. As we add more tools (broker order flow, alternative data, backtests), the agent gains new capabilities without changing the UX; judges can see this architecture live in the demo.
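A minimal sketch of that extensibility, assuming a simple registry pattern (the decorator and stub bodies are hypothetical; only the tool names come from the project):

```python
# Minimal tool-registry sketch: adding a capability is just registering
# another callable; the dispatch loop and UX never change.
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., dict]] = {}

def tool(name: str):
    """Register a callable so the agent can dispatch to it by name."""
    def register(fn: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = fn
        return fn
    return register

@tool("get_event_signals")
def get_event_signals(event_slug: str) -> dict:
    return {"event_slug": event_slug, "signals": []}  # stub body

@tool("search_docs")
def search_docs(query: str, k: int = 5) -> dict:
    return {"query": query, "hits": []}  # stub body

# The agent dispatches by tool name chosen from the LLM's tool call:
result = TOOLS["get_event_signals"]("us-cpi-2024-06")
```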


Challenges we ran into

  • Data sourcing & hygiene: locating, normalizing and backfilling clean macro prints and official Fed docs took significant effort.
  • Hybrid search: fusing vector KNN and SQL fast-paths in TiDB required careful schema design to remain fast and relevant.
  • Multi-agent coherence: ensuring the summarizer, analog finder, and signal generator don’t contradict each other needed prompt engineering + strict structured outputs.
  • Streaming + structured UI: building a smooth, low-latency user experience while preserving deterministic JSON blocks for the frontend.
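The streaming-plus-structured-output challenge can be illustrated with a small sketch: stream prose freely, then split on the @@BLOCKS@@ sentinel to recover the deterministic JSON for the UI (the sentinel name is from the project; the parsing code itself is an assumption):

```python
# Sketch: separate streamed prose from the trailing @@BLOCKS@@ JSON payload
# so the frontend can render deterministic components after the text.
import json

def split_stream(full_text: str):
    prose, sep, tail = full_text.partition("@@BLOCKS@@")
    blocks = json.loads(tail) if sep else []  # no sentinel -> prose only
    return prose.strip(), blocks

streamed = (
    'CPI came in hot; analogs lean bearish. '
    '@@BLOCKS@@[{"type": "signal", "asset": "EURUSD"}]'
)
prose, blocks = split_stream(streamed)
```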

Accomplishments we’re proud of

  • A production-ready, judge-friendly demo with hosted frontend + API and a read-only, preloaded TiDB dataset.
  • A powerful, tool-calling agent that performs real tasks (SQL lookups, vector analogs, signal scoring) and grows stronger as new tools are added.
  • Persistent, explainable signals that can be consumed by bots, backtests, or dashboards.
  • A trader-grade UI (COT, SEP, event analytics) that rivals commercial tools while remaining lightweight and reproducible.

What we learned

  • Clean data is as valuable as model tuning: ingestion and schema design make or break a production analytics product.
  • Structuring AI output (JSON drivers + scores) yields far better downstream utility than freeform text.
  • An agentic, tool-calling architecture scales: adding tools adds capability without rewriting the UX.

What’s next

  • Add broker order-flow and exchange data as callable tools to improve signal fidelity.
  • Expand to equities, crypto, and fixed income.
  • Mobile push alerts and a community marketplace for shared strategies and backtests.

TiDB Cloud Account:

davidgurung2018@gmail.com

Built With

  • fastapi
  • next.js
  • python
  • tidb
  • uvicorn