Inspiration

Global financial markets move trillions of dollars based on subtle shifts in central-bank language. When the Federal Reserve or the Reserve Bank of India (RBI) releases a statement, analysts often compare it manually to the prior version to detect changes in tone, emphasis, and policy stance. That workflow is slow, qualitative, and prone to bias: people anchor on expectations and can miss small but meaningful shifts. We asked a simple question: what if an AI could quantify policy change instantly, so that the primary output is not a summary of what was said but a structured measure of what changed? That idea became Sahasranshu.

What it does

Sahasranshu is a stateless, manifest-driven analysis engine that turns central-bank PDFs into machine-readable “policy deltas.” It:

- Ingests complex regulatory PDFs while preserving page structure and tables
- Extracts key policy stances (e.g., inflation, labor market, growth, financial conditions)
- Compares the current document against a reference version to compute a structured “change vector” across stances (direction + magnitude + confidence)
- Generates hypotheses for what may have caused the change (e.g., “inflation persistence” or “labor market cooling”)
- Produces falsifiable predictions for upcoming economic data releases that could confirm or refute the hypothesis

The output is not prose; it is structured JSON designed for quantitative workflows.

How we built it

We designed Sahasranshu as a functional pipeline to maximize reproducibility and auditability, both critical for finance and compliance.

AI Core (Gemini 3.0)

- Large context window: enabled full documents (and supporting history) to be analyzed together, instead of relying on lossy chunking
- Native JSON mode: enforced strict Pydantic schemas so stance and delta objects can be consumed directly by downstream systems
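To make the idea of a schema-enforced delta object concrete, here is a minimal sketch of what one might look like. The project uses Pydantic; this sketch uses standard-library dataclasses only so that it runs without dependencies. The class name `StanceDelta`, its field names, and the `hawkish`/`dovish`/`unchanged` labels are all assumptions for illustration, not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical direction labels; the real schema is not shown in this write-up.
DIRECTIONS = {"hawkish", "dovish", "unchanged"}


@dataclass(frozen=True)
class StanceDelta:
    topic: str         # e.g. "inflation" or "labor_market" (assumed naming)
    direction: str     # must be one of DIRECTIONS
    magnitude: float   # 0.0 (no change) to 1.0 (large change)
    confidence: float  # 0.0 to 1.0

    def __post_init__(self) -> None:
        # Reject off-schema values at construction time.
        if self.direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {self.direction!r}")
        for name in ("magnitude", "confidence"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def parse_deltas(raw_json: str) -> list[StanceDelta]:
    """Parse a JSON model response; unexpected keys raise TypeError."""
    payload = json.loads(raw_json)
    return [StanceDelta(**item) for item in payload["deltas"]]


def to_json(deltas: list[StanceDelta]) -> str:
    """Serialize deltas back to JSON for downstream consumers."""
    return json.dumps({"deltas": [asdict(d) for d in deltas]}, sort_keys=True)
```

Validating at construction time means a malformed model response fails loudly at the pipeline boundary instead of propagating into downstream quantitative systems.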

Backend

Python 3.11 with a custom GeminiClient that implements exponential backoff, rate-limit handling, and deterministic audit logs
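The retry behavior can be sketched roughly as follows. This is not the project's `GeminiClient`; it is a generic illustration of exponential backoff with full jitter. `RateLimitError` and `call_with_backoff` are hypothetical names, and the delay parameters are placeholder values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever the API client raises on a rate limit."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the ceiling on each attempt, cap it, then apply full jitter.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in production the default `time.sleep` applies.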

Frontend

- A high-performance vanilla JavaScript dashboard inspired by the Bloomberg Terminal
- We avoided heavy frameworks to keep interaction snappy during high-volatility market events

Delta Engine

A dedicated comparison engine that explicitly reasons about changes in intent (not just wording), then outputs normalized deltas with confidence scores.

Challenges we ran into

Comparative hallucination

LLMs are strong at summarizing one document, but comparisons can trigger invented differences.

Solution:

We implemented a manifest-driven comparison protocol that always feeds document pairs together with explicit instructions to ignore stylistic edits and focus only on policy intent.
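One way such a protocol could look is sketched below. The `Manifest` fields, the instruction wording, and the fingerprint helper are assumptions made for illustration; the project's actual manifest format and prompts are not shown here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Manifest:
    # Hypothetical fields: which two documents to compare and on which topics.
    reference_id: str
    current_id: str
    topics: tuple[str, ...]


# Illustrative wording only, paraphrasing the protocol described above.
INSTRUCTIONS = (
    "Compare the CURRENT statement against the REFERENCE statement. "
    "Ignore stylistic edits and report only changes in policy intent. "
    "If a topic shows no change in intent, say so explicitly."
)


def build_comparison_prompt(manifest: Manifest, reference_text: str, current_text: str) -> str:
    """Always send both documents together, labeled, in a fixed order."""
    return "\n\n".join([
        INSTRUCTIONS,
        "Topics: " + ", ".join(manifest.topics),
        f"REFERENCE ({manifest.reference_id}):\n{reference_text}",
        f"CURRENT ({manifest.current_id}):\n{current_text}",
    ])


def run_fingerprint(manifest: Manifest, reference_text: str, current_text: str) -> str:
    """Hash the exact inputs so an audit log can tie outputs to what was compared."""
    payload = json.dumps(
        {"manifest": asdict(manifest), "reference": reference_text, "current": current_text},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the prompt is a pure function of the manifest and the two texts, the same inputs always produce the same request, which fits the stateless design.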

Latency vs. depth

Deep reasoning improves quality, but market users need speed.

Solution:

We parallelized stages (reference loading and stance extraction run concurrently) and optimized prompts for sub-10-second end-to-end runs.

PDF complexity (tables, footnotes, formatting)

Policy caveats often live in the hardest-to-parse parts of documents.

Solution:

We leaned on Gemini’s native multimodal PDF understanding to preserve structure and interpret tables/footnotes more reliably than OCR-first approaches.
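Returning to the latency point above, running reference loading and stance extraction concurrently can be sketched with a thread pool. The two stage functions here are hypothetical placeholders that simulate I/O with a short sleep; in a real pipeline they would wrap file reads and model API calls, and the project's own concurrency mechanism may differ.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_reference(doc_id: str) -> str:
    # Placeholder for loading the prior statement (simulated I/O wait).
    time.sleep(0.05)
    return f"reference text for {doc_id}"


def extract_stances(doc_id: str) -> dict[str, str]:
    # Placeholder for a model call on the current statement (simulated I/O wait).
    time.sleep(0.05)
    return {"inflation": f"stance from {doc_id}"}


def run_stages(reference_id: str, current_id: str) -> tuple[str, dict[str, str]]:
    """Start both independent stages at once, then wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        ref_future = pool.submit(load_reference, reference_id)
        stance_future = pool.submit(extract_stances, current_id)
        return ref_future.result(), stance_future.result()
```

Threads are a reasonable fit when the stages spend most of their time waiting on network or disk, since the total wait approaches the slower stage rather than the sum of both.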

Accomplishments that I'm proud of

- Built a complete “delta-first” pipeline end to end: PDF ingestion → stance extraction → change computation → hypothesis + prediction → structured output
- Achieved reproducible, auditable runs using a stateless, manifest-driven design
- Enforced strict JSON schemas (Pydantic) for reliability in downstream quantitative systems
- Delivered a fast, terminal-style UI that makes policy changes explorable in seconds
- Reduced noisy “summary-first” behavior by making change detection the primary objective

What I learned

- Change is the highest-value signal. In time-series text, “what changed” matters more than “what it says.”
- Schema is strategy. For fintech use cases, structured outputs are the difference between insight and unusable text.
- Comparisons require specialized prompting and structure; treating delta detection as a first-class task significantly improves accuracy.
- Native multimodal PDF reasoning is a big advantage for regulatory documents, especially where tables and footnotes matter.

What's next for Sahasranshu: Delta-First NLP for Central Bank Intelligence

- Expand coverage beyond one institution: Fed, ECB, BoE, RBI, and major emerging-market central banks
- Build a “policy delta index” that tracks stance shifts over time and across countries
- Add event-driven workflows: automatic ingestion when new statements/minutes/speeches drop
- Improve evaluation: benchmark delta accuracy against human analyst annotations and historical market reactions
- Integrate more artifacts: meeting minutes, speeches, Q&A transcripts, and macro data releases for stronger causal attribution
- Package as an API + enterprise dashboard for funds, research desks, and risk teams
