💡 Inspiration

Modern Private Equity (PE) due diligence is highly manual, consumes hundreds of analyst hours, and is vulnerable to human cognitive bias or cherry-picked metrics. While Large Language Models (LLMs) have made strides, they suffer from three fatal flaws when applied to finance:

  1. Hallucinated Math: Standard LLMs cannot reliably compute complex financial logic (like DCF projections or debt schedules) with decimal-point accuracy.
  2. Lack of Sector Context: Standard models fail to recognize that valuing a bank requires a Price-to-Book and Return-on-Equity model, while valuing an industrial requires an EV/EBITDA model.
  3. Absence of Adversarial Stress-Testing: Analysts often suffer from confirmation bias, looking only for data that supports their initial thesis, and a standard LLM does nothing to challenge it.

We built Project Veritas to address these flaws. It is an institutional-grade, multi-agent AI engine that automates the PE due diligence pipeline, shrinking weeks of research into a few minutes of self-correcting, mathematically rigorous, and adversarial analysis.


🔍 What it does

Project Veritas acts as an automated, full-stack PE Investment Committee. The system ingests a target company's ticker (e.g., PG, AXP) along with optional PDF documents (like annual reports or filings), and runs a highly structured, sequential multi-agent execution pipeline:

  • RAG over Financial Theory: Queries a vector database (ChromaDB + BGE-M3 embeddings) populated with 40+ core finance textbooks to extract optimal audit checklists and industry-specific due diligence frameworks.
  • Live Extraction & Benchmarking: Pulls real-time market data via yfinance and discovers global peers, then parses peer metrics and precedent transaction databases exported from S&P Capital IQ.
  • Financial Forensics: Runs an earnings quality audit by executing the 8 Schilit forensic tests, adjusting reported EBITDA, and scoring credit/leverage safety.
  • Adversarial Committee Debate: Orchestrates a 2-round structured debate between a Deal Champion (Bull) defending the upside and a Risk Partner (Bear) exposing downside risks.
  • Investment Chair Decision: Synthesizes the debate, determines a final verdict (APPROVE / HOLD / REJECT), and designs a dynamic entry strategy.
  • Executive Presentation: Renders a board-ready Streamlit dashboard with interactive peer comparables, sensitivity tables, and a downloadable professional HTML investment memo.
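
The two-round committee debate can be sketched as a simple loop in which each side responds to the other's latest argument. This is only an illustration under stated assumptions: `run_debate`, `bull_stub`, and `bear_stub` are hypothetical names, and the stubs are plain functions standing in for the LLM-backed Deal Champion and Risk Partner.

```python
from typing import Callable, List, Tuple

# A debater takes the opposing side's latest statement and returns a reply.
Debater = Callable[[str], str]


def run_debate(bull: Debater, bear: Debater, thesis: str,
               rounds: int = 2) -> List[Tuple[int, str, str]]:
    """Alternate bull and bear arguments for a fixed number of rounds."""
    transcript: List[Tuple[int, str, str]] = []
    last_statement = thesis
    for round_no in range(1, rounds + 1):
        bull_argument = bull(last_statement)      # Bull answers the latest objection
        transcript.append((round_no, "bull", bull_argument))
        bear_argument = bear(bull_argument)       # Bear attacks the Bull's argument
        transcript.append((round_no, "bear", bear_argument))
        last_statement = bear_argument
    return transcript


# Hypothetical stand-ins for the LLM-backed agents.
def bull_stub(prompt: str) -> str:
    return f"Upside case in response to: {prompt[:40]}"


def bear_stub(prompt: str) -> str:
    return f"Downside risk in response to: {prompt[:40]}"


transcript = run_debate(bull_stub, bear_stub, "Acquire PG at current multiple")
for round_no, role, text in transcript:
    print(round_no, role, text)
```

The resulting transcript is an ordered list that a chair agent could then summarize into a verdict.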

🛠️ How we built it

The backbone of Project Veritas is a modular, multi-agent pipeline governed by our custom Mailbox Protocol, a sequential data-passing model that propagates a single state object, deal_context, between seven specialized agents.
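
As a rough illustration of this kind of sequential state passing, the sketch below threads one dictionary through a list of agent functions. Only the name deal_context comes from the description above; `run_pipeline`, the two agent functions, and every key in the dictionary are hypothetical placeholders, not the actual Mailbox Protocol implementation.

```python
from typing import Any, Callable, Dict, List

DealContext = Dict[str, Any]
Agent = Callable[[DealContext], DealContext]


def research_agent(ctx: DealContext) -> DealContext:
    # Hypothetical agent: attaches a placeholder diligence checklist.
    return {**ctx, "checklist": ["revenue recognition", "leverage"]}


def forensics_agent(ctx: DealContext) -> DealContext:
    # Hypothetical agent: records forensic flags (none in this toy example).
    return {**ctx, "forensic_flags": []}


def run_pipeline(ticker: str, agents: List[Agent]) -> DealContext:
    """Pass a single state object through each agent in order."""
    ctx: DealContext = {"ticker": ticker, "log": []}
    for agent in agents:
        ctx = agent(ctx)                               # each agent returns the updated state
        ctx["log"] = ctx["log"] + [agent.__name__]     # audit trail of who ran
    return ctx


deal_context = run_pipeline("PG", [research_agent, forensics_agent])
print(deal_context["log"])
```

Because every agent reads from and writes to the same object in a fixed order, the run is easy to replay and each agent's contribution is easy to audit.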

The Technical Stack:

  • LLM & Inference: Llama 3.3 70B hosted on AMD Instinct™ GPU infrastructure via Fireworks AI and NVIDIA NIM.
  • Vector Database: ChromaDB with BGE-M3 hybrid embeddings for low-latency methodology retrieval.
  • Calculations: Standardized Python mathematical modules. To prevent LLM math hallucinations, all formulas are coded in pure Python, utilizing the LLM only to parameterize the assumptions.
  • Data Pipelines: Integrates yfinance API, S&P Capital IQ spreadsheets, and Tavily search fallback restricted to verified financial authorities (e.g., SEC, Damodaran, BSE).
  • Frontend: A custom glassmorphism Streamlit UI displaying real-time agent workflow logs and exporting clean HTML.
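
The split between LLM-supplied assumptions and deterministic Python math might look like the sketch below. It is an assumption-laden illustration: `GrowthAssumptions`, `parse_assumptions`, `project_ebitda`, the JSON field names, and the validation bounds are all hypothetical, and a standard-library dataclass stands in for the Pydantic models mentioned under Challenges so that the example has no third-party dependencies.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GrowthAssumptions:
    """Assumptions the LLM is allowed to set (hypothetical fields)."""
    revenue_growth: float   # annual growth as a decimal, 0.05 = 5%
    ebitda_margin: float    # EBITDA / revenue as a decimal

    def __post_init__(self) -> None:
        # Illustrative sanity bounds; reject implausible LLM output early.
        if not -0.5 <= self.revenue_growth <= 1.0:
            raise ValueError(f"implausible revenue_growth: {self.revenue_growth}")
        if not 0.0 < self.ebitda_margin < 1.0:
            raise ValueError(f"implausible ebitda_margin: {self.ebitda_margin}")


def parse_assumptions(llm_json: str) -> GrowthAssumptions:
    """Validate the LLM's JSON output before any math is run on it."""
    raw = json.loads(llm_json)
    return GrowthAssumptions(
        revenue_growth=float(raw["revenue_growth"]),
        ebitda_margin=float(raw["ebitda_margin"]),
    )


def project_ebitda(base_revenue: float, a: GrowthAssumptions,
                   years: int) -> List[float]:
    """Deterministic projection: the LLM never performs this arithmetic."""
    projections = []
    revenue = base_revenue
    for _ in range(years):
        revenue *= 1 + a.revenue_growth
        projections.append(revenue * a.ebitda_margin)
    return projections


assumptions = parse_assumptions('{"revenue_growth": 0.10, "ebitda_margin": 0.20}')
print(project_ebitda(100.0, assumptions, years=2))
```

Out-of-range values raise a ValueError before they can reach the projection code, which keeps a bad LLM response from silently contaminating downstream numbers.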

Mathematical Rigor:

We implement a deterministic Cost of Equity (CAPM) lookup using Aswath Damodaran's (NYU Stern) industry risk datasets: $$CoE = R_f + (\beta \times ERP) + CRP$$

where $R_f$ is the US 10Y Treasury yield ($4.3\%$), $\beta$ is the industry beta taken from the Damodaran datasets, $ERP$ is the Equity Risk Premium ($5.5\%$), and $CRP$ is the Country Risk Premium.
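
A minimal sketch of the lookup and formula above, using the stated $R_f$ and $ERP$ values. The beta table is purely illustrative: the industry keys and beta values are assumptions made for this example, not actual Damodaran figures, and `cost_of_equity` is a hypothetical function name.

```python
RISK_FREE_RATE = 0.043        # US 10Y Treasury yield, as stated above
EQUITY_RISK_PREMIUM = 0.055   # ERP, as stated above

# Illustrative static lookup; the keys and betas are assumed for this sketch.
INDUSTRY_BETAS = {
    "Household & Personal Products": 0.80,
    "Credit Services": 1.20,
}


def cost_of_equity(industry: str, country_risk_premium: float = 0.0) -> float:
    """CoE = R_f + beta * ERP + CRP, with beta from a static table."""
    beta = INDUSTRY_BETAS[industry]   # KeyError on unknown industries, by design
    return RISK_FREE_RATE + beta * EQUITY_RISK_PREMIUM + country_risk_premium


coe = cost_of_equity("Household & Personal Products")
print(f"{coe:.2%}")
```

Failing loudly on an unmatched industry string, rather than falling back to a default beta, keeps the result deterministic for any given input.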

Our DCF engine calculates Enterprise Value using a McKinsey Value Driver model for terminal value: $$EV = \sum_{t=1}^{5} \frac{FCF_t}{(1 + WACC)^t} + \frac{Terminal\ Value}{(1 + WACC)^5}$$

$$Terminal\ Value = \frac{NOPLAT_5 \times (1 - \frac{g}{RONIC})}{WACC - g}$$
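
The two formulas translate fairly directly into Python. The sketch below is an illustration rather than the engine itself: the function names are hypothetical, the explicit forecast horizon is taken from the length of the cash-flow list (five years in the formula above), and the final-year NOPLAT is used in the terminal value exactly as written above.

```python
from typing import Sequence


def terminal_value(noplat: float, g: float, ronic: float, wacc: float) -> float:
    """Value driver formula: NOPLAT * (1 - g / RONIC) / (WACC - g)."""
    if wacc <= g:
        raise ValueError("WACC must exceed the long-term growth rate g")
    return noplat * (1 - g / ronic) / (wacc - g)


def enterprise_value(fcfs: Sequence[float], noplat_final: float,
                     g: float, ronic: float, wacc: float) -> float:
    """Discounted explicit-period FCF plus discounted terminal value."""
    pv_fcf = sum(fcf / (1 + wacc) ** t for t, fcf in enumerate(fcfs, start=1))
    tv = terminal_value(noplat_final, g, ronic, wacc)
    return pv_fcf + tv / (1 + wacc) ** len(fcfs)


# Toy inputs, chosen only to exercise the functions.
ev = enterprise_value(
    fcfs=[90.0, 95.0, 100.0, 105.0, 110.0],
    noplat_final=120.0, g=0.02, ronic=0.10, wacc=0.08,
)
print(round(ev, 2))
```

The guard on WACC and g matters in practice: as g approaches WACC the denominator approaches zero and the terminal value grows without bound, so small input errors can dominate the whole valuation.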


⚡ Challenges we ran into

  • Hallucinated Metrics: Early runs resulted in agents generating mismatched numbers for valuation comps. We solved this by implementing strict validation schemas (Pydantic models) and separating the mathematical calculations from the text-generation models.
  • Cost of Equity Drifts: Small drifts in beta resulted in wide valuation spreads. We built a deterministic sector-matching function that links yfinance industry strings directly to static, cached Damodaran lookup files.
  • API Rate Limits during Agent Chain Calls: Parallel API calls were frequently rate-limited. We hardened our inference wrapper with jittered exponential backoff, automatic model failover, and in-memory caching of peer data.
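
A retry wrapper combining jittered exponential backoff with model failover could look roughly like this. Everything here is a hypothetical sketch: `RateLimitError`, `call_with_backoff`, the retry counts, and the delays are assumptions, and the `call` argument stands in for whatever client function actually performs the inference request.

```python
import random
import time
from typing import Callable, Sequence


class RateLimitError(Exception):
    """Hypothetical error raised by the client when the provider throttles us."""


def call_with_backoff(call: Callable[[str], str], models: Sequence[str],
                      max_retries: int = 4, base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry each model with jittered exponential backoff, then fail over."""
    for model in models:                       # failover order: first model is primary
        for attempt in range(max_retries):
            try:
                return call(model)
            except RateLimitError:
                # "Full jitter": wait a random time up to the exponential cap,
                # so concurrent agents do not all retry at the same instant.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("all models exhausted after retries")
```

Injecting `sleep` as a parameter keeps the wrapper testable without real waiting; in production the default `time.sleep` applies.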

🏆 Accomplishments that we're proud of

  • Dual-Valuation Switch: The pipeline automatically detects the company's sector and dynamically switches the entire valuation logic (e.g., using Price/Book and ROE for banks like American Express, and EV/EBITDA for consumer companies like P&G).
  • Forensic Safety Interlocking: If the Financial Forensics Agent flags accounting anomalies or low Quality of Earnings, the Orchestrator is programmed to automatically override positive mathematical upside with a REJECT or HOLD verdict.
  • Zero Math Hallucination: Every number displayed in the peer comps, football fields, and sensitivity grids is computed by deterministic Python code rather than generated by the LLM, so the final HTML memo is ready to be handed to an active Chief Investment Officer.
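
The sector switch and the forensic interlock described above can be illustrated with two small functions. This is a sketch under stated assumptions: the sector strings, the quality-of-earnings score scale, the thresholds, and the rule for choosing between REJECT and HOLD are all hypothetical and chosen only to make the logic concrete.

```python
from typing import List

# Assumed sector label for banks and similar financials.
FINANCIAL_SECTORS = {"Financial Services"}


def valuation_method(sector: str) -> str:
    """Pick the valuation framework from the company's sector."""
    return "P/B-ROE" if sector in FINANCIAL_SECTORS else "EV/EBITDA"


def final_verdict(upside: float, forensic_flags: List[str], qoe_score: float,
                  min_qoe: float = 0.6, min_upside: float = 0.15) -> str:
    """Forensic findings override valuation upside (thresholds are illustrative)."""
    if forensic_flags or qoe_score < min_qoe:
        # Interlock: no APPROVE is possible once forensics raises a concern.
        return "REJECT" if len(forensic_flags) >= 2 else "HOLD"
    return "APPROVE" if upside >= min_upside else "HOLD"


print(valuation_method("Financial Services"))             # bank-style valuation
print(final_verdict(0.40, ["channel stuffing"], 0.9))     # upside is overridden
```

Checking the forensic condition first means a large modeled upside can never reach the APPROVE branch while an anomaly is outstanding.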

📖 What we learned

  • Multi-agent communication is highly unstable when left unstructured. Implementing a strict, sequential state-passing system (the Mailbox Protocol) is far superior to standard conversational chat models for professional business intelligence.
  • LLMs cannot be trusted to do the math on their own, but they excel at synthesis, debate, and hypothesis stress-testing when paired with deterministic code execution.

🚀 What's next for Project Veritas

We plan to scale Project Veritas to:

  • Support automated private company scraping by connecting OCR parsers to uploaded PDF pitch decks.
  • Expand the CapIQ data connector to fetch live capital markets data automatically instead of relying on exported spreadsheets.
  • Deploy on localized enterprise hardware utilizing fine-tuned small-language models (SLMs) for proprietary, firewalled PE funds.
