Project Veritas

💡 Inspiration

Modern Private Equity (PE) due diligence is highly manual, consumes hundreds of analyst hours, and is vulnerable to human cognitive bias or cherry-picked metrics. While Large Language Models (LLMs) have made strides, they suffer from three fatal flaws when applied to finance:

Hallucinated Math: Standard LLMs cannot reliably carry out multi-step financial calculations (like DCF projections or debt schedules) with decimal-point accuracy.
Lack of Sector Context: Standard models fail to recognize that valuing a bank requires a Price-to-Book and Return-on-Equity model, while valuing an industrial requires an EV/EBITDA model.
Absence of Adversarial Stress-Testing: Analysts often suffer from confirmation bias, looking only for data that supports their initial thesis, and a standard LLM does nothing to challenge it.

We built Project Veritas to solve this. It is an institutional-grade, multi-agent AI engine that automates the PE due diligence pipeline—shrinking weeks of research into a few minutes of self-correcting, mathematically rigorous, and adversarial analysis.

🔍 What it does

Project Veritas acts as an automated, full-stack PE Investment Committee. The system ingests a target company's ticker (e.g., PG, AXP) along with optional PDF documents (like annual reports or filings), and runs a highly structured, sequential multi-agent execution pipeline:

RAG over Financial Theory: Queries a vector database (ChromaDB + BGE-M3 embeddings) populated with 40+ core finance textbooks to extract optimal audit checklists and industry-specific due diligence frameworks.
Live Extraction & Benchmarking: Pulls real-time market data via yfinance and discovers global peers, then parses peer metrics and precedent transaction databases exported from S&P Capital IQ.
Financial Forensics: Runs an earnings quality audit by executing the 8 Schilit forensic tests, adjusting reported EBITDA, and scoring credit/leverage safety.
Adversarial Committee Debate: Orchestrates a 2-round structured debate between a Deal Champion (Bull) defending the upside and a Risk Partner (Bear) exposing downside risks.
Investment Chair Decision: Synthesizes the debate, determines a final verdict (APPROVE / HOLD / REJECT), and designs a dynamic entry strategy.
Executive Presentation: Renders a board-ready Streamlit dashboard with interactive peer comparables, sensitivity tables, and a downloadable professional HTML investment memo.

🛠️ How we built it

The backbone of Project Veritas is a modular, multi-agent pipeline governed by our custom Mailbox Protocol—a sequential data-passing model that propagates a single state object, deal_context, between seven specialized agents.
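A minimal sketch of the sequential state-passing idea, assuming each agent is a plain function that takes `deal_context` and returns an updated copy. The agent names, the dictionary keys, and the `run_pipeline` helper are hypothetical illustrations, not the project's actual code.

```python
from typing import Any, Callable, Dict, List

DealContext = Dict[str, Any]
Agent = Callable[[DealContext], DealContext]


def methodology_agent(ctx: DealContext) -> DealContext:
    # Hypothetical stand-in: a real agent would query the vector database.
    return {**ctx, "checklist": ["revenue recognition", "working capital"]}


def data_agent(ctx: DealContext) -> DealContext:
    # Hypothetical stand-in: a real agent would pull market and peer data.
    return {**ctx, "peers": ["PEER_A", "PEER_B"]}


def run_pipeline(ticker: str, agents: List[Agent]) -> DealContext:
    """Pass one state object through the agents strictly in order."""
    deal_context: DealContext = {"ticker": ticker, "log": []}
    for agent in agents:
        deal_context = agent(deal_context)
        # Record which agent just wrote to the shared state.
        deal_context["log"] = deal_context["log"] + [agent.__name__]
    return deal_context


result = run_pipeline("PG", [methodology_agent, data_agent])
print(result["log"])
```

Because every agent reads and writes the same object in a fixed order, a downstream agent can only see what upstream agents have explicitly placed in the state.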

The Technical Stack:

LLM & Inference: Llama 3.3 70B hosted on AMD Instinct™ GPU infrastructure via Fireworks AI and NVIDIA NIM.
Vector Database: ChromaDB with BGE-M3 hybrid embeddings for low-latency methodology retrieval.
Calculations: Standardized Python mathematical modules. To prevent LLM math hallucinations, all formulas are implemented in pure Python; the LLM only supplies the assumptions that parameterize them.
Data Pipelines: Integrates yfinance API, S&P Capital IQ spreadsheets, and Tavily search fallback restricted to verified financial authorities (e.g., SEC, Damodaran, BSE).
Frontend: A custom glassmorphism Streamlit UI displaying real-time agent workflow logs and exporting clean HTML.
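One way to picture the split between LLM and calculation code is the sketch below: the model returns assumptions as JSON, the code validates them, and only deterministic Python touches the numbers. The field names, the plausibility ranges, and the `project_revenue` helper are assumptions made for illustration. The write-up mentions Pydantic for validation; a standard-library dataclass is swapped in here to keep the example dependency-free.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Assumptions:
    revenue_growth: float
    terminal_growth: float
    wacc: float

    def __post_init__(self) -> None:
        # Hypothetical sanity ranges; reject anything the model invents outside them.
        if not -0.5 <= self.revenue_growth <= 1.0:
            raise ValueError("revenue_growth outside plausible range")
        if not 0.0 <= self.terminal_growth < self.wacc:
            raise ValueError("terminal_growth must be in [0, wacc)")


def parse_assumptions(llm_json: str) -> Assumptions:
    """Turn the LLM's JSON reply into validated, typed assumptions."""
    raw = json.loads(llm_json)
    return Assumptions(
        revenue_growth=float(raw["revenue_growth"]),
        terminal_growth=float(raw["terminal_growth"]),
        wacc=float(raw["wacc"]),
    )


def project_revenue(base: float, a: Assumptions, years: int = 5) -> List[float]:
    """Deterministic projection: the LLM never performs this arithmetic."""
    return [base * (1 + a.revenue_growth) ** t for t in range(1, years + 1)]
```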

Mathematical Rigor (LaTeX):

We implement a deterministic Cost of Equity (CAPM) lookup using Aswath Damodaran's (NYU Stern) industry risk datasets: $$CoE = R_f + (\beta \times ERP) + CRP$$

Where $R_f$ is the US 10Y Treasury yield ($4.3\%$), $\beta$ is the industry beta taken from the Damodaran dataset, $ERP$ is the Equity Risk Premium ($5.5\%$), and $CRP$ is the Country Risk Premium.
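As a worked example, the formula can be evaluated directly in Python. The defaults below use the $R_f$ and $ERP$ values quoted above; the beta of 0.8 and the zero country risk premium are hypothetical inputs chosen only for illustration.

```python
def cost_of_equity(
    beta: float,
    rf: float = 0.043,   # US 10Y Treasury yield quoted in the text
    erp: float = 0.055,  # Equity Risk Premium quoted in the text
    crp: float = 0.0,    # Country Risk Premium (assumed zero here)
) -> float:
    """CoE = R_f + (beta * ERP) + CRP."""
    return rf + beta * erp + crp


coe = cost_of_equity(beta=0.8)  # hypothetical industry beta
print(f"{coe:.4f}")  # 0.0870
```

With these inputs, each 0.1 change in beta moves the cost of equity by 0.55 percentage points, which shows how sensitive the result is to the beta input.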

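Our DCF engine calculates Enterprise Value over a five-year explicit forecast, using the McKinsey value driver formula for terminal value: $$EV = \sum_{t=1}^{5} \frac{FCF_t}{(1 + WACC)^t} + \frac{Terminal\ Value}{(1 + WACC)^5}$$

$$Terminal\ Value = \frac{NOPLAT_{6} \times (1 - \frac{g}{RONIC})}{WACC - g}$$

Where $NOPLAT_{6} = NOPLAT_5 \times (1 + g)$ is the net operating profit less adjusted taxes in the first year beyond the forecast, $g$ is the perpetual growth rate, and $RONIC$ is the return on new invested capital.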
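The two equations translate into a few lines of Python. This is a sketch under stated assumptions, not the project's engine: the function names are hypothetical, the NOPLAT figure in the terminal value numerator is passed in as-is, and all inputs in the usage lines are made-up numbers.

```python
from typing import Sequence


def terminal_value(noplat: float, g: float, ronic: float, wacc: float) -> float:
    """Value driver formula: NOPLAT * (1 - g / RONIC) / (WACC - g)."""
    if ronic <= 0:
        raise ValueError("RONIC must be positive")
    if wacc <= g:
        raise ValueError("WACC must exceed the perpetual growth rate g")
    return noplat * (1 - g / ronic) / (wacc - g)


def enterprise_value(fcfs: Sequence[float], tv: float, wacc: float) -> float:
    """Discount each year's FCF, then discount the terminal value
    from the final forecast year."""
    pv_fcf = sum(fcf / (1 + wacc) ** t for t, fcf in enumerate(fcfs, start=1))
    return pv_fcf + tv / (1 + wacc) ** len(fcfs)


# Hypothetical inputs for illustration only.
tv = terminal_value(noplat=100.0, g=0.02, ronic=0.10, wacc=0.08)
ev = enterprise_value([60.0, 65.0, 70.0, 75.0, 80.0], tv, wacc=0.08)
print(round(tv, 2), round(ev, 2))
```

The guard on `wacc <= g` matters in practice: as $g$ approaches $WACC$ the denominator shrinks toward zero and the terminal value grows without bound.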

⚡ Challenges we ran into

Hallucinated Metrics: Early runs resulted in agents generating mismatched numbers for valuation comps. We solved this by implementing strict validation schemas (Pydantic models) and separating the mathematical calculations from the text-generation models.
Cost of Equity Drifts: Small drifts in beta resulted in wide valuation spreads. We built a deterministic sector-matching function that links yfinance industry strings directly to static, cached Damodaran lookup files.
API Rate Limits during Agent Chain Calls: Parallel API calls were frequently rate-limited. We hardened our inference wrapper with jittered exponential backoff, automatic model failover, and heavy in-memory caching of peer data.
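The retry behaviour described in the last item can be sketched as follows. This is an illustration under assumptions: `RateLimitError` is a hypothetical stand-in for whatever error the inference provider raises, the delay parameters are arbitrary, and model failover and caching are left out.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on rate limits with jittered exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Exponential ceiling, then "full jitter": sleep a random
            # amount in [0, ceiling] so concurrent retries do not synchronize.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0.0, ceiling))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting.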

🏆 Accomplishments that we're proud of

Dual-Valuation Switch: The pipeline automatically detects the company's sector and dynamically switches the entire valuation logic (e.g., using Price/Book and ROE for banks like American Express, and EV/EBITDA for consumer companies like P&G).
Forensic Safety Interlocking: If the Financial Forensics Agent flags accounting anomalies or low Quality of Earnings, the Orchestrator is programmed to automatically override positive mathematical upside with a REJECT or HOLD verdict.
Zero Math Hallucination: Every number displayed in the peer comps, football fields, and sensitivity grids is computed by deterministic Python code rather than generated by the LLM, so the final HTML memo is ready to be handed to an active Chief Investment Officer.
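The sector switch and the forensic interlock can be pictured with the sketch below. Everything specific in it is an assumption: the sector labels, the quality-of-earnings floor of 0.6, and the rule that flagged anomalies map to REJECT while a low score alone maps to HOLD.

```python
from typing import Tuple

# Assumed sector labels that route to the bank-style valuation model.
FINANCIAL_SECTORS = {"Financial Services"}


def select_valuation_model(sector: str) -> Tuple[str, ...]:
    """Pick the valuation metrics from the company's sector string."""
    if sector in FINANCIAL_SECTORS:
        return ("P/B", "ROE")
    return ("EV/EBITDA",)


def final_verdict(
    model_verdict: str,
    forensic_flags: int,
    qoe_score: float,
    qoe_floor: float = 0.6,  # hypothetical quality-of-earnings threshold
) -> str:
    """Let forensic findings override a positive valuation-driven verdict."""
    if model_verdict != "APPROVE":
        return model_verdict  # nothing positive to override
    if forensic_flags > 0:
        return "REJECT"  # assumed rule: flagged anomalies block the deal
    if qoe_score < qoe_floor:
        return "HOLD"  # assumed rule: weak earnings quality pauses it
    return "APPROVE"


print(select_valuation_model("Financial Services"))
print(final_verdict("APPROVE", forensic_flags=0, qoe_score=0.4))
```

Keeping the override in plain code, outside the LLM, means a persuasive bull case cannot talk its way past a failed forensic check.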

📖 What we learned

Multi-agent communication is highly unstable when left unstructured. Implementing a strict, sequential state-passing system (the Mailbox Protocol) is far superior to standard conversational chat models for professional business intelligence.
LLMs cannot be trusted to do an analyst's math, but they excel at synthesis, debate, and hypothesis stress-testing when paired with deterministic code execution.

🚀 What's next for Project Veritas

We plan to scale Project Veritas to:

Support automated private company scraping by connecting OCR parsers to uploaded PDF pitch decks.
Expand the CapIQ data connector to fetch live capital markets data automatically instead of relying on exported spreadsheets.
Deploy on local enterprise hardware using fine-tuned small language models (SLMs) for proprietary, firewalled PE funds.

Built With

amd
bge-m3
chromadb
llama
numpy
pandas
python
pytorch
streamlit
tavily

Updates

Moosa Talha Al Kaseri started this project — May 18, 2026 01:05 AM EDT
