The Problem

Most AI tools make founders feel good about bad ideas. They generate polished roadmaps, optimistic task lists, and structured plans, all of which assume the idea already works. Nobody challenges the assumptions underneath. Nobody asks what happens if the foundational beliefs are wrong. The result: people invest weeks of work into ideas that were never viable, and only find out when it's too late. The real problem isn't execution. It's assumption blindness.

What Forge Does

Forge is an AI-powered idea stress-tester. You submit a raw idea. Forge tears it apart, makes you defend it piece by piece, and only builds your execution plan from what survives. The core insight is simple: every idea is a tree of assumptions, and assumptions have a kill order. Some are foundational: if they fail, everything above them is worthless. Most tools ignore this structure entirely. Forge makes it the entire product. You don't get a roadmap for free. You earn it.

How We Built It

Forge runs on a structured AI pipeline with a clean separation between what the AI does and what the human decides.

Step 1: Classify and Pressure-Test

The user submits one sentence. Forge classifies the idea type and generates four domain-specific skeptical personas: not generic archetypes, but stakeholders calibrated to the actual idea. A fintech startup gets a unit economics investor and a procurement skeptic. A social initiative gets a community organizer and a sustainability analyst. The personas are generated per idea because four fixed roles applied to every idea would be useless.

Step 2: Build the Assumption Graph

Forge extracts 5–8 falsifiable assumptions and arranges them as a dependency graph: a directed acyclic graph (DAG), not a flat list. Foundational assumptions sit at the bottom. Dependent assumptions lock above them until their parents are resolved. The Roadmap node sits at the very top, unlocking only when everything beneath it reaches a terminal state. This structure matters: it makes the kill order visible and forces the founder to address root causes before surface-level execution.

Step 3: Defend Your Assumptions

The user clicks any active node. A chat panel opens with the assigned persona, and the conversation runs for up to three exchanges. The persona is skeptical, specific, and in character. Convinced: the node goes green and its children unlock. Not convinced after three exchanges: the node goes red and all of its descendants collapse. No partial states. No AI-assigned scores. No automated verdicts. The founder either makes the case or they don't.

Step 4: Earn the Roadmap

Once all reachable nodes are resolved, the Roadmap node activates. Clicking it generates an execution plan built exclusively from green assumptions. Red nodes appear explicitly as risks: never hidden, never papered over. If too many foundational assumptions failed, Forge returns a pivot recommendation instead of a false plan.

Interface: a live, rotating 3D assumption tree built with react-force-graph-3d. Node colors reflect real-time state: gray for pending, pulsing blue for active, green glow for defended, red for rejected, gold for the Roadmap. The graph is the UI. It shows you exactly where your idea stands at every moment.
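The locking and kill-propagation rules from Steps 2 through 4 can be sketched as a small state machine. This is a minimal illustration, not Forge's actual code: the names `Node`, `AssumptionGraph`, `resolve`, and `roadmap_inputs` are hypothetical, the Roadmap is modeled as a check over the graph rather than as a node, and the three-exchange chat and the pivot recommendation are left out because the write-up does not specify how they are decided.

```python
from dataclasses import dataclass, field

# Node states, mirroring the colors described above (names are assumptions).
PENDING, ACTIVE, GREEN, RED = "pending", "active", "green", "red"
TERMINAL = {GREEN, RED}


@dataclass
class Node:
    name: str
    parents: list = field(default_factory=list)  # names this assumption depends on
    state: str = PENDING


class AssumptionGraph:
    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self._unlock()

    def _children(self, name):
        return [n for n in self.nodes.values() if name in n.parents]

    def _unlock(self):
        # A pending node becomes active once every parent has been defended.
        for n in self.nodes.values():
            if n.state == PENDING and all(
                self.nodes[p].state == GREEN for p in n.parents
            ):
                n.state = ACTIVE

    def resolve(self, name, convinced):
        # The outcome of the conversation is passed in; nothing is scored here.
        node = self.nodes[name]
        if node.state != ACTIVE:
            raise ValueError(f"{name} is not active")
        if convinced:
            node.state = GREEN
            self._unlock()
        else:
            node.state = RED
            self._collapse(name)

    def _collapse(self, name):
        # Kill propagation: every descendant of a rejected node is rejected too.
        stack = [name]
        while stack:
            for child in self._children(stack.pop()):
                if child.state != RED:
                    child.state = RED
                    stack.append(child.name)

    def roadmap_unlocked(self):
        return all(n.state in TERMINAL for n in self.nodes.values())

    def roadmap_inputs(self):
        # Green assumptions feed the plan; red ones are surfaced as risks.
        if not self.roadmap_unlocked():
            raise RuntimeError("unresolved assumptions remain")
        states = [(n.name, n.state) for n in self.nodes.values()]
        return {
            "build_on": [name for name, s in states if s == GREEN],
            "risks": [name for name, s in states if s == RED],
        }


# Illustrative assumptions for a made-up idea.
graph = AssumptionGraph([
    Node("users_have_problem"),
    Node("users_will_pay", parents=["users_have_problem"]),
    Node("channel_reaches_users", parents=["users_have_problem"]),
    Node("unit_economics_work", parents=["users_will_pay", "channel_reaches_users"]),
])
graph.resolve("users_have_problem", convinced=True)
graph.resolve("users_will_pay", convinced=True)
graph.resolve("channel_reaches_users", convinced=False)
print(graph.roadmap_inputs())
```

In this run, rejecting `channel_reaches_users` also collapses `unit_economics_work`, so the roadmap inputs list the two defended assumptions under `build_on` and the other two under `risks`.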

Why This Needs AI

The same system handles a B2B SaaS, a public health initiative, a class project, and a research paper, each with different personas, different assumptions, and different dependency structures. A rules engine can't do that. It produces the same four generic roles and the same generic checklist regardless of what the idea actually is. LLMs let Forge understand the specific idea, construct the right pressure, and simulate genuine stakeholder skepticism. That's not a buzzword justification; it's the only way this works.

What AI does not do: decide whether an assumption is validated. That is always the human's call. The AI structures the fight. The founder has to win it.

Responsible AI

Risk: Users treat a green graph as proof their idea works, rather than as a decision input.

Mitigation: Full chat history is preserved and visible for every node. Mentors, teammates, and judges can audit exactly how each assumption was resolved. Outputs are framed as decision inputs throughout, never as correct answers. Red nodes are never hidden.

Challenges

The hardest decision was removing automated verdicts. Our first design had AI personas automatically score each assumption. It felt efficient. It was wrong: it just produced false confidence with extra steps. Removing it and making resolution emerge from genuine conversation made the tool harder to use and more honest. That was the right tradeoff.

The second challenge was the graph structure. A flat list of assumptions is easy to build and easy to ignore. A dependency graph with real kill propagation forces the founder to care about order and causality. Getting the LLM to produce a real DAG with meaningful branching, not a disguised linear chain, required careful prompt engineering.
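One way to catch a disguised linear chain is to check the model's output structurally before rendering it. The sketch below is an assumption about how such a check could look, not a description of Forge's pipeline: `check_assumption_graph` and its problem labels are hypothetical, and the input format (a mapping from each assumption to the assumptions it depends on) is chosen for illustration. Cycle detection uses `graphlib` from the Python standard library.

```python
from graphlib import CycleError, TopologicalSorter


def check_assumption_graph(parents):
    """Return a list of structural problems; an empty list means the graph passes.

    `parents` maps each assumption name to the names it depends on.
    """
    unknown = {p for deps in parents.values() for p in deps} - set(parents)
    if unknown:
        return ["unknown_parent"]

    try:
        # static_order() raises CycleError if the dependencies are not acyclic.
        tuple(TopologicalSorter(parents).static_order())
    except CycleError:
        return ["cycle"]

    problems = []
    if not 5 <= len(parents) <= 8:
        problems.append("wrong_size")  # the write-up targets 5-8 assumptions

    child_count = {name: 0 for name in parents}
    for deps in parents.values():
        for p in deps:
            child_count[p] += 1
    roots = [name for name, deps in parents.items() if not deps]

    if all(not deps for deps in parents.values()):
        problems.append("flat_list")  # no dependencies at all
    elif (
        len(roots) == 1
        and all(len(deps) <= 1 for deps in parents.values())
        and all(count <= 1 for count in child_count.values())
    ):
        problems.append("linear_chain")  # one path from root to leaf, no branching
    return problems


chain = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"], "e": ["d"]}
branched = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": ["d"]}
print(check_assumption_graph(chain))
print(check_assumption_graph(branched))
```

A failed check could trigger a retry with a stricter prompt; whether that fits depends on how the generation step is wired, which the write-up does not describe.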

What We Learned

The most valuable AI systems are not the ones that generate the best answers. They're the ones that force better questions. Forge doesn't tell you your idea is good or bad. It tells you exactly what it depends on and makes you prove it, one assumption at a time. Progress doesn't begin with certainty. It begins with identifying the next assumption worth testing.
