Inspiration

Early-stage incubators and VCs face a massive screening bottleneck. Programs like Techstars and Y Combinator receive tens of thousands of applications annually. Even reviewing 1,000 startups at just 30 minutes each costs:

$$ 1000 \text{ applications} \times 0.5 \text{ hours} = 500 \text{ analyst hours} $$

At 40 hours a week, that is more than 12 weeks of manual, cognitively demanding evaluation for a single analyst, and the results are often subjective and inconsistent.

Human investors don’t make decisions linearly. They:

  • Form hypotheses
  • Challenge assumptions
  • Debate trade-offs
  • Weigh uncertainty

So we asked:

What if AI could simulate an actual investment committee — not just answer prompts?

That question led to JudgeVC.


What it does

JudgeVC is an AI Investment Committee.

It automates top-of-funnel startup screening by simulating structured debate between specialized agents.

Instead of a single LLM producing a yes/no answer, JudgeVC:

  • Analyzes market opportunity
  • Evaluates founder strength
  • Assesses technical defensibility
  • Runs financial projections
  • Conducts pro vs. contra debate
  • Assigns Tier 1 / Tier 2 / Tier 3 with confidence scores

Users upload an Excel sheet of startup applications. JudgeVC returns a tiered Excel output with structured reasoning.
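To make the shape of one output row concrete, here is a minimal sketch of a result record. The class name, field names, and the example applicant are illustrative assumptions, not the actual JudgeVC schema.

```python
from dataclasses import dataclass, asdict


@dataclass
class ScreeningResult:
    """One screened application (hypothetical field names)."""
    startup: str
    tier: int          # 1, 2, or 3
    confidence: float  # 0.0 to 1.0
    reasoning: str

    def __post_init__(self):
        # Reject values that could not appear in a tiered output.
        if self.tier not in (1, 2, 3):
            raise ValueError(f"tier must be 1, 2, or 3, got {self.tier}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0, 1], got {self.confidence}")


def to_row(result: ScreeningResult) -> dict:
    """Flatten a result into a dict suitable for one spreadsheet row."""
    row = asdict(result)
    row["tier"] = f"Tier {result.tier}"
    return row


example = ScreeningResult(
    startup="ExampleCo",  # hypothetical applicant
    tier=2,
    confidence=0.71,
    reasoning="Large market, but defensibility is unproven.",
)
print(to_row(example))
```

A list of such rows can be handed to a spreadsheet writer to produce the tiered file.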

It compresses weeks of screening into minutes — without sacrificing transparency.


How we built it

JudgeVC is a multi-agent architecture orchestrated by NVIDIA Nemotron.

System Design

  • Nemotron (Orchestrator & Market Agent): coordinates the workflow, activates agents, calls tools, and synthesizes the final decision.

  • Claude (Team & Counter-Thesis Agent): evaluates founder quality and generates structured counter-arguments.

  • GPT-4 (Technical Risk & Judge Agent): assesses defensibility and scores debate outputs.

  • Financial Projection Tool (Python function): a deterministic valuation estimator:

$$ \text{Projected Revenue} = \text{Market Size} \times \text{Capture Rate} $$

$$ \text{Valuation} = \text{Projected Revenue} \times \text{Industry Multiple} $$

Nemotron performs multi-step reasoning, structured tool-calling, and agent coordination.

This is not simple prompt chaining. It is agentic automation.
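The coordination pattern described above can be sketched with asyncio. This is a simplified illustration under stated assumptions: the agent functions are stubs that return fixed scores instead of calling a model, and the plain average stands in for the orchestrator's synthesis step. All names and return shapes are hypothetical.

```python
import asyncio


# Stub agents: in a real pipeline each would call a model API.
# The fixed scores below are placeholders, not real evaluations.
async def market_agent(app: dict) -> dict:
    await asyncio.sleep(0)  # stands in for a network call
    return {"agent": "market", "score": 0.8}


async def team_agent(app: dict) -> dict:
    await asyncio.sleep(0)
    return {"agent": "team", "score": 0.6}


async def technical_agent(app: dict) -> dict:
    await asyncio.sleep(0)
    return {"agent": "technical", "score": 0.7}


async def evaluate(app: dict) -> dict:
    """Run the independent analyses concurrently, then combine them."""
    analyses = await asyncio.gather(
        market_agent(app),
        team_agent(app),
        technical_agent(app),
    )
    # Placeholder synthesis: a simple mean of the agent scores.
    mean_score = sum(a["score"] for a in analyses) / len(analyses)
    return {
        "startup": app["name"],
        "analyses": list(analyses),
        "mean_score": mean_score,
    }


result = asyncio.run(evaluate({"name": "ExampleCo"}))
print(result["startup"], round(result["mean_score"], 2))
```

Running the analyses with `asyncio.gather` keeps per-application latency close to that of the slowest agent instead of the sum of all three.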

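The financial projection tool that Nemotron calls:

def financial_projection(market_size: float, capture_rate: float, multiple: float) -> dict:
    """Deterministic valuation estimate from market size, capture rate, and industry multiple."""
    # Projected revenue is the share of the market the startup is expected to capture.
    revenue = market_size * capture_rate
    # Valuation applies the industry multiple to the projected revenue.
    valuation = revenue * multiple
    return {"revenue": revenue, "valuation": valuation}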
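Because the arguments come from a model, it can help to validate them before computing. The wrapper below is a sketch, not part of the original tool: `checked_projection` is a hypothetical name, it assumes the capture rate is a fraction between 0 and 1, and the input figures are made up for illustration.

```python
def financial_projection(market_size, capture_rate, multiple):
    revenue = market_size * capture_rate
    valuation = revenue * multiple
    return {"revenue": revenue, "valuation": valuation}


def checked_projection(market_size: float, capture_rate: float, multiple: float) -> dict:
    """Validate inputs, then delegate to financial_projection (hypothetical wrapper)."""
    if market_size < 0:
        raise ValueError("market_size must be non-negative")
    # Assumption: capture_rate is a fraction, so 0.01 means 1%.
    if not 0.0 <= capture_rate <= 1.0:
        raise ValueError("capture_rate must be a fraction between 0 and 1")
    if multiple < 0:
        raise ValueError("multiple must be non-negative")
    return financial_projection(market_size, capture_rate, multiple)


# Made-up inputs: a $2B market, 1% capture, and an 8x revenue multiple.
out = checked_projection(2_000_000_000, 0.01, 8)
print(f"revenue=${out['revenue']:,.0f} valuation=${out['valuation']:,.0f}")
```

With these inputs the projected revenue is about $20 million and the valuation about $160 million, matching the two formulas above.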

Challenges we ran into

Model Disagreement

Different models often produced conflicting evaluations. We implemented a structured debate layer and calibrated scoring thresholds to stabilize classifications.
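One way to decide when a debate round is needed is to measure how far the agents' scores spread. The sketch below is one possible approach, not the project's actual logic: the threshold value, function names, and scores are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical threshold; in practice it would be calibrated on real data.
DISAGREEMENT_THRESHOLD = 0.15


def needs_debate(scores: dict, threshold: float = DISAGREEMENT_THRESHOLD) -> bool:
    """Return True when agent scores spread widely enough to warrant a debate round."""
    return pstdev(list(scores.values())) > threshold


def consensus(scores: dict) -> float:
    """Plain average, used only when the agents broadly agree."""
    return mean(scores.values())


aligned = {"market": 0.80, "team": 0.75, "technical": 0.70}
split = {"market": 0.90, "team": 0.40, "technical": 0.50}

for label, scores in (("aligned", aligned), ("split", split)):
    if needs_debate(scores):
        print(label, "-> escalate to structured debate")
    else:
        print(label, "-> consensus score", round(consensus(scores), 2))
```

Here the aligned scores fall under the threshold and are averaged, while the split scores exceed it and would be escalated.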


Stochastic Variability

LLMs are probabilistic. Small output variations led to inconsistent tiering. We reduced volatility by integrating deterministic financial tools and confidence banding.
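Confidence banding can be read in several ways. The sketch below shows one interpretation: score the same application several times, tier it on the mean, and mark the result low-confidence when the mean sits close to a tier cut-off. The cut-offs, band width, and sample scores are hypothetical values chosen for illustration.

```python
from statistics import mean, pstdev

# Hypothetical cut-offs: a score at or above the value earns that tier.
TIER_CUTOFFS = ((0.75, "Tier 1"), (0.50, "Tier 2"))
BAND_WIDTH = 0.05  # hypothetical minimum distance from a cut-off


def tier_for(score: float) -> str:
    """Map a score to a tier using the cut-offs above."""
    for cutoff, tier in TIER_CUTOFFS:
        if score >= cutoff:
            return tier
    return "Tier 3"


def classify(samples: list) -> dict:
    """Tier on the mean of repeated runs and flag results near a cut-off."""
    avg = mean(samples)
    spread = pstdev(samples)
    # The band is at least BAND_WIDTH and widens when the runs are noisy.
    band = max(BAND_WIDTH, spread)
    near_cutoff = any(abs(avg - cutoff) < band for cutoff, _ in TIER_CUTOFFS)
    return {
        "tier": tier_for(avg),
        "mean": avg,
        "spread": spread,
        "confidence": "low" if near_cutoff else "high",
    }


print(classify([0.90, 0.88, 0.92]))  # well clear of both cut-offs
print(classify([0.74, 0.77, 0.76]))  # mean sits just above the Tier 1 cut-off
```

Low-confidence results are natural candidates for a second debate round or for human review.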


Tool Integration

Ensuring Nemotron could reliably call Python functions and integrate outputs into downstream reasoning required careful function schema design and structured prompting.
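A function schema and a defensive dispatcher might look like the sketch below. It assumes the tool is described in the JSON-schema style used by OpenAI-compatible function-calling APIs. Whether JudgeVC used this exact format is an assumption, and the `dispatch` helper and the description strings are hypothetical.

```python
import json


def financial_projection(market_size, capture_rate, multiple):
    revenue = market_size * capture_rate
    valuation = revenue * multiple
    return {"revenue": revenue, "valuation": valuation}


# Tool description in the OpenAI-compatible "tools" format (assumed here).
FINANCIAL_TOOL = {
    "type": "function",
    "function": {
        "name": "financial_projection",
        "description": "Estimate projected revenue and valuation for a startup.",
        "parameters": {
            "type": "object",
            "properties": {
                "market_size": {"type": "number", "description": "Total market size in USD."},
                "capture_rate": {"type": "number", "description": "Expected market share as a fraction."},
                "multiple": {"type": "number", "description": "Industry revenue multiple."},
            },
            "required": ["market_size", "capture_rate", "multiple"],
        },
    },
}

# Registry mapping tool names to Python callables.
TOOLS = {"financial_projection": financial_projection}


def dispatch(tool_name: str, arguments_json: str) -> str:
    """Run a model-requested tool call and return a JSON string for the next turn."""
    if tool_name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {tool_name}"})
    try:
        args = json.loads(arguments_json)
        required = FINANCIAL_TOOL["function"]["parameters"]["required"]
        missing = [name for name in required if name not in args]
        if missing:
            return json.dumps({"error": f"missing arguments: {missing}"})
        return json.dumps(TOOLS[tool_name](**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or unexpected arguments are reported back, not raised.
        return json.dumps({"error": str(exc)})


print(dispatch("financial_projection",
               '{"market_size": 1000, "capture_rate": 0.5, "multiple": 2}'))
```

Returning errors as JSON instead of raising lets the orchestrating model see what went wrong and retry with corrected arguments.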


Human-Level Alignment

Matching real investor intuition was non-trivial. We tuned evaluation thresholds and debate scoring logic to better approximate real screening dynamics.

Automating judgment under uncertainty was our hardest engineering challenge.


Accomplishments that we're proud of

  • Built a fully functioning multi-agent evaluation pipeline
  • Successfully integrated NVIDIA Nemotron as the orchestration brain
  • Implemented real tool-calling within reasoning loops
  • Created structured debate instead of single-opinion outputs
  • Automated Excel → Tiered Excel workflow end-to-end
  • Delivered explainable decisions rather than black-box outputs

We transformed LLMs from chat tools into structured decision infrastructure.


What we learned

  • Multi-agent reasoning outperforms single-prompt workflows for complex decisions.
  • Debate structures reduce bias and improve interpretability.
  • Deterministic tools are critical for stabilizing AI systems.
  • Orchestration matters more than raw model size.

We learned that early-stage investment decisions are fundamentally dialectical — not linear.


What's next for JudgeVC

  • Direct integration with applicant portals
  • Voice-based founder pitch ingestion
  • Retrieval-Augmented Generation (RAG) with live market datasets
  • Continuous learning from real investment outcomes
  • Customizable scoring frameworks per incubator thesis

Our long-term goal is to build decision infrastructure for early-stage ecosystems — scalable, explainable, and agentic by design.


JudgeVC is not a chatbot. It is automated venture screening powered by multi-agent AI.

Built With

  • 3.5
  • anthropic
  • asyncio
  • css
  • fastapi
  • framer
  • lovable
  • motion
  • ollama
  • openai
  • openpyxl
  • openroute
  • pandas
  • parallel
  • pydantic
  • python
  • react
  • router
  • shadcn/ui
  • sse
  • tailwind
  • typescript
  • vercel
  • vite