Live Demo: https://idea-forge-web.web.app/
Inspiration
Every founder has ideas. Almost none have a first step.
We kept watching the same pattern — friends, classmates, hackathon teammates — write exciting ideas in Notion, pitch them in group chats, feel the rush of possibility… and then do nothing. The gap was never ideation. It was execution. A vague idea like "I want to build a tutoring marketplace for college students" has no specificity to act on, no hidden assumptions surfaced, no risks mapped, and no 24-hour action plan. So 90 days get wasted building something nobody wanted.
We asked: what if an AI could do what a good cofounder does in a whiteboard session — challenge your assumptions, map your risks, design milestones, and hand you one real step you can take today — but structured, rigorous, and honest?
That's IdeaForge.
What it does
IdeaForge is not a chatbot. It's an execution intelligence system — a forcing function that stress-tests your vague idea against cold reality in under 10 minutes.
You type any idea (a startup, a hackathon project, a side business). IdeaForge runs it through five sequential AI reasoning phases, each applying a distinct decision framework:
- Clarify — Strips buzzwords. Compresses your idea to one precise sentence. Uses Jobs-to-be-Done and the Mom Test to identify the functional, emotional, and social jobs your idea must serve.
- Assume — Surfaces the 3 hidden assumptions that will kill the idea. Uses Gary Klein's Pre-Mortem technique and Kahneman's System 1/2 framework to catch cognitive blind spots.
- Risk — Maps 3 execution risks with impact×likelihood scores and concrete mitigations. One must be a unit-economics check. One must be a Black Swan event.
- Milestones — Designs 3 milestones where each produces a learning, not just a deliverable. Uses Lean Startup's Build-Measure-Learn cycle and BJ Fogg's Behavior Model.
- First Step — Stamps one 24-hour action with a ready-to-paste artifact (a literal cold email, landing page headline, or test script you copy and execute immediately). Includes a "Human must decide" checkpoint — a decision the AI explicitly refuses to make.
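The impact×likelihood scoring used in the Risk phase can be pictured with a small sketch. The 1-5 scales, the severity thresholds, the `Risk` class, and the example risks below are all illustrative assumptions, not values taken from IdeaForge itself.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    impact: int       # assumed scale: 1 (minor) to 5 (fatal)
    likelihood: int   # assumed scale: 1 (rare) to 5 (near certain)
    mitigation: str

    @property
    def score(self) -> int:
        # Impact x likelihood, so the range is 1..25 under the assumed scales.
        return self.impact * self.likelihood


def severity(score: int) -> str:
    # Hypothetical cut-offs for a severity pill.
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"


# Made-up risks for the tutoring-marketplace idea.
risks = [
    Risk("Unit economics: tutor payout exceeds take rate", 5, 3, "Model margins per session"),
    Risk("Black Swan: campus bans paid tutoring platforms", 5, 1, "Read the campus policy first"),
    Risk("Students keep using free group chats", 3, 4, "Interview 10 students this week"),
]

ranked = sorted(risks, key=lambda r: r.score, reverse=True)
for r in ranked:
    print(f"{r.score:>2} {severity(r.score):<6} {r.name}")
```

Sorting by score puts the unit-economics check first here, while the Black Swan lands last because of its low likelihood, even though its impact is just as high.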
The results build live on an Execution Canvas — a structured dashboard on the right side of the screen with confidence meters, severity pills, risk grids, milestone cards, the first step, and the human-decision flag. Everything streams in real time.
How we built it
Backend: Python 3.11 + FastAPI + SQLite
The server is a FastAPI application with four route modules. The API key never reaches the browser: all AI calls happen server-side. A SQLAlchemy 2.0 ORM layer with four tables persists sessions across page reloads.
AI Pipeline: 5 Sequential Phases
Each phase calls NVIDIA NIM (z-ai/glm-5.1) via the OpenAI Python SDK. Each phase has its own temperature (0.5–0.65), token budget, JSON schema, and cognitive framework. Crucially, each phase sees the cumulative output of all prior phases via a canvas_summary() function.
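A minimal sketch of the cumulative-context idea, assuming the canvas is a plain dict keyed by phase name. Only the name `canvas_summary` comes from the project; its body, the per-phase temperatures (picked from the 0.5–0.65 range), `run_pipeline`, and the stubbed `fake_model` are assumptions made for illustration. A real run would call the model where the stub is.

```python
import json
from typing import Callable

# Phase name and temperature. The exact per-phase values are assumed.
PHASES = [
    ("clarify", 0.50),
    ("assume", 0.55),
    ("risk", 0.60),
    ("milestones", 0.60),
    ("first_step", 0.65),
]


def canvas_summary(canvas: dict) -> str:
    """Render every completed phase as one line of compact JSON."""
    if not canvas:
        return "(no prior phases)"
    return "\n".join(
        f"[{name}] {json.dumps(output, sort_keys=True)}"
        for name, output in canvas.items()
    )


def run_pipeline(idea: str, call_model: Callable[[str, str, float], dict]) -> dict:
    """Run the phases in order; each prompt embeds all earlier outputs."""
    canvas: dict = {}
    for name, temperature in PHASES:
        prompt = f"Idea: {idea}\n\nCanvas so far:\n{canvas_summary(canvas)}"
        canvas[name] = call_model(name, prompt, temperature)
    return canvas


def fake_model(phase: str, prompt: str, temperature: float) -> dict:
    # Stand-in for the LLM call: reports how many prior phases it could see.
    return {"phase": phase, "prior_phases_seen": prompt.count("\n[")}


result = run_pipeline("A tutoring marketplace for college students", fake_model)
print(result["first_step"])
```

Because the summary is rebuilt before every call, the last phase sees all four earlier outputs without any phase needing to know about the others.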
Streaming & JSON Reliability
We used Server-Sent Events (SSE), consumed in the browser via fetch + ReadableStream, to stream tokens to the UI. Because LLMs don't always return clean JSON, we built a custom _ThinkParser state machine to split the model's inline thinking tags from the final payload, backed by a one-shot JSON enforcement retry that achieves a ~95% success rate.
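The sketch below shows one way such a parser could work. It is not the project's _ThinkParser: the `<think>` tag name, the `ThinkParser` class, and the `parse_json_with_retry` helper are assumptions. The main difficulty it illustrates is that a tag can be split across two streamed chunks, so the parser holds back any tail that might be the start of a tag.

```python
import json
from typing import Callable


class ThinkParser:
    OPEN, CLOSE = "<think>", "</think>"  # assumed tag names

    def __init__(self) -> None:
        self.in_think = False
        self.buffer = ""
        self.thinking: list[str] = []
        self.payload: list[str] = []

    def feed(self, chunk: str) -> None:
        self.buffer += chunk
        while self.buffer:
            tag = self.CLOSE if self.in_think else self.OPEN
            sink = self.thinking if self.in_think else self.payload
            idx = self.buffer.find(tag)
            if idx != -1:
                # Full tag found: emit the text before it and flip state.
                sink.append(self.buffer[:idx])
                self.buffer = self.buffer[idx + len(tag):]
                self.in_think = not self.in_think
                continue
            # No full tag: keep back a tail that could be a partial tag.
            keep = 0
            for n in range(min(len(tag) - 1, len(self.buffer)), 0, -1):
                if tag.startswith(self.buffer[-n:]):
                    keep = n
                    break
            cut = len(self.buffer) - keep
            sink.append(self.buffer[:cut])
            self.buffer = self.buffer[cut:]
            break

    def finish(self) -> tuple[str, str]:
        # Flush whatever is left into the current sink.
        (self.thinking if self.in_think else self.payload).append(self.buffer)
        self.buffer = ""
        return "".join(self.thinking), "".join(self.payload)


def parse_json_with_retry(raw: str, retry: Callable[[str], str]) -> dict:
    """Try once; on failure ask `retry` for a corrected string exactly once."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return json.loads(retry(raw))


parser = ThinkParser()
for chunk in ["<thi", "nk>check as", "sumptions</th", 'ink>{"ok"', ": true}"]:
    parser.feed(chunk)
thinking, payload = parser.finish()
print(thinking, "|", payload)
```

In this sketch `retry` stands in for a second model call that demands JSON only; if that second attempt also fails, the `JSONDecodeError` propagates to the caller.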
Responsible AI Guardrails
We baked in a Computed Confidence score (math-driven, not AI-hallucinated), an Adversarial Tone rule that bans cheerleading, and an isolated Keyword Guardrail that scans raw inputs to block ideas involving illegal activity, ToS violations, scraping, or unconsented surveillance.
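As a rough illustration of two of these guardrails, the sketch below scans raw input against a keyword table and derives a confidence score from arithmetic alone. The keyword lists, the penalty weights, and the function names `check_input` and `computed_confidence` are all hypothetical; the project's actual formula and blocklist are not described here.

```python
# Hypothetical blocklist, grouped by category.
BLOCKED_KEYWORDS = {
    "scraping": ("scrape", "scraping", "scraper"),
    "surveillance": ("spy on", "track without consent", "stalkerware"),
}


def check_input(raw: str) -> list[str]:
    """Return the categories whose keywords appear in the raw input.

    Plain substring matching is crude (it would also flag "skyscraper"),
    but it runs before any model call and cannot be talked out of a match.
    """
    text = raw.lower()
    return [
        category
        for category, words in BLOCKED_KEYWORDS.items()
        if any(word in text for word in words)
    ]


def computed_confidence(assumption_severities: list[int], risk_scores: list[int]) -> int:
    """Assumed formula: start at 100, subtract weighted penalties, clamp to 0..100.

    Severities are assumed to be 1-5 and risk scores 1-25.
    """
    penalty = 4 * sum(assumption_severities) + sum(risk_scores)
    return max(0, min(100, 100 - penalty))


print(check_input("Scrape LinkedIn profiles to find tutors"))
print(check_input("A tutoring marketplace for college students"))
print(computed_confidence([3, 2, 1], [15, 12, 5]))
```

The point of computing the score in code is that the same canvas always yields the same number, whatever the model says about its own certainty.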
Challenges we ran into
Squashing bugs and hitting API limits: We hit a large number of bugs during development, particularly in LLM state management, JSON parsing, and handling mid-stream pivot requests (the "Frankenstein Bug").
Our biggest hurdle, however, was the free-tier API limits. We repeatedly exhausted our free quota on the z-ai/glm-5.1 endpoint, which caused the app to silently stall or drop requests. To survive this, we engineered an independent per-feature key rotation system that draws on multiple accounts, plus an idle-timeout watchdog that catches silent stalls and automatically rotates the API key in the backend without the user ever noticing.
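A minimal sketch of an idle-timeout watchdog, assuming the stream is an async iterator of tokens and that a stalled attempt is simply restarted with the next key. `collect_with_watchdog`, `fake_stream`, and the key names are hypothetical, and the real system rotates keys per feature, which this sketch does not model.

```python
import asyncio
from typing import AsyncIterator, Callable, Sequence


async def collect_with_watchdog(
    make_stream: Callable[[str], AsyncIterator[str]],
    keys: Sequence[str],
    idle_timeout: float,
) -> tuple[str, list[str]]:
    """Return (key_used, tokens); rotate keys whenever a stream goes quiet."""
    for key in keys:
        stream = make_stream(key).__aiter__()
        tokens: list[str] = []
        try:
            while True:
                # The timeout applies per token, so it measures idle time,
                # not the total duration of the response.
                token = await asyncio.wait_for(stream.__anext__(), idle_timeout)
                tokens.append(token)
        except StopAsyncIteration:
            return key, tokens
        except asyncio.TimeoutError:
            continue  # silent stall: drop partial output, try the next key
    raise RuntimeError("every API key stalled")


async def fake_stream(key: str) -> AsyncIterator[str]:
    # Stand-in for the model stream: the first key simulates a silent stall.
    if key == "key-A":
        await asyncio.sleep(1)
    for token in ("FOR", "GED"):
        yield token


key_used, tokens = asyncio.run(
    collect_with_watchdog(fake_stream, ["key-A", "key-B"], idle_timeout=0.05)
)
print(key_used, tokens)
```

Restarting from scratch keeps the sketch simple; a production version would also need to decide what to do with tokens already shown to the user before the stall.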
Accomplishments that we're proud of
Making a stunning, professional web app: We are incredibly proud of the UI/UX. We did not use React or heavy frontend frameworks; we wrote ~1200 lines of Vanilla JS and ~400 lines of CSS.
Even without a framework, we built a polished "Dark Industrial Forge" aesthetic. It features fluid dual-panel layouts, a collapsible reasoning drawer with a green-phosphor scanline effect, interactive traffic-light confidence meters, and an immensely satisfying full-screen "FORGED" stamp animation complete with CSS spark particles and sound effects. Achieving this level of studio-grade polish from scratch is a massive win for our team.
What we learned
We learned how to use a massive stack of new tools and gained invaluable full-stack engineering experience. Building this taught us:
- Structured Output > Chat: A phase-based architecture with strict JSON schemas produces dramatically better output than a generic chat prompt.
- Complex Streaming: We learned how to properly handle Server-Sent Events (SSE) and parse chunked data streams in real time.
- UX Principles: We learned that "Suggest-then-approve" (forcing the user to click apply on a pivot) builds far more trust than an AI silently mutating data.
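The chunked-stream lesson comes down to one rule: network chunks do not line up with SSE events, so incoming text has to be buffered until a blank line closes an event. The sketch below shows that rule in Python for consistency with the backend, although the project's client does this in JavaScript. The `SSEParser` class and the `token` event name are assumptions, and the sketch handles only LF line endings and the `event` and `data` fields.

```python
class SSEParser:
    def __init__(self) -> None:
        self.buffer = ""

    def feed(self, chunk: str) -> list[dict]:
        """Add a chunk and return every event completed by it."""
        self.buffer += chunk
        events = []
        # A blank line (two consecutive newlines) terminates an event.
        while "\n\n" in self.buffer:
            frame, self.buffer = self.buffer.split("\n\n", 1)
            name, data = "message", []
            for line in frame.split("\n"):
                if not line or line.startswith(":"):
                    continue  # skip empty lines and comment lines
                field, _, value = line.partition(":")
                if value.startswith(" "):
                    value = value[1:]  # one leading space is not part of the value
                if field == "event":
                    name = value
                elif field == "data":
                    data.append(value)
            if data:
                events.append({"event": name, "data": "\n".join(data)})
        return events


parser = SSEParser()
received = []
# Chunk boundaries deliberately fall inside field names and values.
for chunk in [
    "event: token\nda",
    'ta: {"t": "Hel"}\n\nevent: tok',
    'en\ndata: {"t": "lo"}\n\n',
]:
    received.extend(parser.feed(chunk))
print(received)
```

Parsing each chunk on its own would have produced a broken `da` field and an event named `tok`; buffering until the blank line avoids both.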
Overall, this project pushed us to our limits, and we learned an incredible amount about building resilient AI systems.
What's next for IdeaForge
Scale, Open-Source, and BYOK. Our immediate next step is to open-source this entire project so the builder and hackathon community can use it, study it, and contribute to it. To solve the free-tier API bottlenecks and allow the platform to scale indefinitely, our next major feature will be "Bring Your Own Key" (BYOK). This will enable users to plug in their own NVIDIA, OpenAI, or Anthropic API keys to run the forge seamlessly on their own infrastructure.
Built With
- css3 (custom properties, flexbox, grid, keyframe animations)
- fastapi
- firebase-firestore 10.12 (anonymous cross-device sync)
- google-fonts (oswald, inter, jetbrains-mono)
- nvidia-nim (z-ai/glm-5.1 via openai-compatible api)
- openai-python-sdk (asyncopenai streaming)
- pydantic-2
- pydantic-settings
- python-3.11
- python-dotenv
- render
- sqlalchemy-2.0
- sqlite
- uvicorn
- vanilla-javascript (fetch api, readablestream, abortcontroller, server-sent events)