Inspiration

We noticed that people, ourselves included, commit to ideas before testing whether they actually hold up. IdeaForge forces that check before the time gets spent, not after.

What it does

IdeaForge is a conversational AI tool that turns a vague idea into a tested decision. Many students, first-time founders, and career-switchers face a decision whose shape they cannot fully see, and they tend to start building before testing whether the idea holds up. Instead of generating a plan immediately, IdeaForge surfaces the hidden assumptions underneath an idea, maps the real risks, and forces an explicit human decision (BUILD, PIVOT, or DROP) before any execution plan is generated. It stops users from sinking three weeks of work into an untested guess.

How we built it

IdeaForge is a single-file HTML/CSS/JavaScript web app that calls the Google Gemini API (gemini-2.5-flash) directly from the browser. The core is a stage-gated prompting pipeline: idea capture, clarification, assumption surfacing, risk analysis, and a reference-profile comparison. Each stage returns structured JSON rather than free text, so every assumption and risk ties to a named pattern and an explicit consequence. A separate adversarial pass ("Challenge My Idea") re-examines the generated plan to find its single weakest assumption.
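A minimal sketch of what "structured JSON per stage" could look like in practice. The stage identifiers, the field names (`statement`, `pattern`, `consequence`), and both function names are assumptions made for illustration; they are not taken from the IdeaForge source.

```javascript
// Hypothetical stage identifiers, in the order the write-up lists them.
const STAGES = [
  "idea_capture",
  "clarification",
  "assumption_surfacing",
  "risk_analysis",
  "profile_comparison",
];

// Return the stage that follows `current`, or null after the last stage.
function nextStage(current) {
  const index = STAGES.indexOf(current);
  if (index === -1) {
    throw new Error("Unknown stage: " + current);
  }
  return index + 1 < STAGES.length ? STAGES[index + 1] : null;
}

// Parse the raw model text for the assumption stage and reject any
// item that lacks a named pattern or an explicit consequence.
// The field names are assumed, not the project's real schema.
function parseAssumptions(rawText) {
  let data;
  try {
    data = JSON.parse(rawText);
  } catch (err) {
    throw new Error("Stage output is not valid JSON: " + err.message);
  }
  if (!data || !Array.isArray(data.assumptions)) {
    throw new Error("Stage output is missing an 'assumptions' array");
  }
  for (const item of data.assumptions) {
    for (const field of ["statement", "pattern", "consequence"]) {
      if (typeof item[field] !== "string" || item[field].trim() === "") {
        throw new Error("Assumption is missing required field '" + field + "'");
      }
    }
  }
  return data.assumptions;
}
```

Validating each stage's output before advancing means a malformed response stops the pipeline at that stage instead of leaking into later prompts.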

Challenges we ran into

Our biggest challenges were backend reliability, not the reasoning logic itself. We went through several API providers (OpenRouter, then NVIDIA NIM, which was unavailable in our region, then Groq) before settling on direct Gemini API calls. We also hit a real bug where long plan-generation responses were silently truncated by the token limit, leaving broken JSON in the chat. We fixed this by raising the output token budget and turning truncation into a visible, retryable error instead of a silent failure.
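One way to surface truncation explicitly is sketched below. It assumes the response object has the shape returned by the Gemini REST `generateContent` endpoint, where a candidate cut off by the output token limit carries `finishReason: "MAX_TOKENS"`. The `RetryableResponseError` class and the `extractPlan` function are hypothetical names, not the project's actual code.

```javascript
// Hypothetical error type: the UI can check `retryable` and offer a retry button.
class RetryableResponseError extends Error {
  constructor(message) {
    super(message);
    this.name = "RetryableResponseError";
    this.retryable = true;
  }
}

// Turn a generateContent-style response into parsed JSON,
// or throw a visible, retryable error instead of returning broken text.
function extractPlan(response) {
  const candidate = response?.candidates?.[0];
  if (!candidate) {
    throw new RetryableResponseError("The model returned no candidates.");
  }
  // A response cut off by the output token limit is reported here.
  if (candidate.finishReason === "MAX_TOKENS") {
    throw new RetryableResponseError("The plan was cut off by the token limit.");
  }
  // Join all text parts before parsing.
  const text = (candidate.content?.parts ?? [])
    .map((part) => part.text ?? "")
    .join("");
  try {
    return JSON.parse(text);
  } catch {
    throw new RetryableResponseError("The plan was not valid JSON.");
  }
}
```

Checking the finish reason before parsing keeps a half-written JSON string from ever reaching the chat view.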

Accomplishments that we're proud of

The two human-in-the-loop checkpoints (an early proactive flag for weak ideas, and a hard decision gate before any plan generates) aren't just described in our write-up; they're structurally enforced in the code. We're also proud that the system stays honest even after a decision is locked in: in testing, after choosing BUILD, it still surfaced "Market Risk: High" instead of retroactively agreeing with the user's choice.
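As a rough illustration of a gate enforced in code rather than in prompt wording, the sketch below refuses to call the plan generator until a decision has been recorded. The session object and the names `createSession`, `recordDecision`, and `requestPlan` are hypothetical; only the three decision labels come from the write-up.

```javascript
// The three decisions named in the write-up.
const DECISIONS = ["BUILD", "PIVOT", "DROP"];

// Hypothetical session state: no decision has been made yet.
function createSession() {
  return { decision: null };
}

// Record the user's explicit choice; anything else is rejected.
function recordDecision(session, choice) {
  if (!DECISIONS.includes(choice)) {
    throw new Error("Decision must be one of: " + DECISIONS.join(", "));
  }
  session.decision = choice;
}

// The gate: the generator callback is never invoked without a decision.
function requestPlan(session, generatePlan) {
  if (session.decision === null) {
    throw new Error("Choose BUILD, PIVOT, or DROP before a plan is generated.");
  }
  return generatePlan(session.decision);
}
```

Because the check lives in `requestPlan` itself, no prompt change or model output can skip the human decision.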

What we learned

Building this taught us that a reasoning tool is only as trustworthy as its failure handling. The token-truncation bug showed that a single silent failure does more damage to user trust than a wrong answer would, since it breaks the exact honesty the tool is supposed to represent. We also learned that constraining the AI, making it stop and ask instead of always deciding, was what actually made it feel trustworthy rather than a limitation on what it could do.

What's next for IdeaForge

Turning the validation threshold into a real feedback loop: logging actual interview outcomes against the threshold the AI proposed, so the system's calibration improves against real user data instead of just its own training knowledge.
