Inspiration
Every indie dev has the same ghost story: a wildly ambitious first game — "an open-world RPG with multiplayer, crafting, and a big story!" — that dies in a folder six months later because the scope was never within a thousand miles of their real free time. The gap between the dream and the calendar is invisible until it's too late. We wanted to make that gap visible on day one, then do the kinder, harder thing: hand the dreamer a version they can actually ship.
The brief sharpened it — it warns against the "glorified task generator" trap and asks: why an LLM and not a rules engine? That question shaped every decision.
What it does
You give Ship It your dream game in plain English plus your real capacity (people × hours/week × weeks). It returns:
- A reality check — e.g. "needs ~7,300 hours; you have 520 — about 14× over, roughly 7 years at your pace."
- The scope-killers — the systems quietly eating your timeline, and why.
- A shippable vertical slice + a sequenced milestone roadmap + a Day-1 task.
- A scale-up plan instead, if your dream actually fits. Honest both ways.
It also reasons about what really moves the number (your AI tooling, your genre, and your experience), all exposed as live "what-if" controls.
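The reality-check arithmetic can be sketched in a few lines. This is only an illustration: the `Capacity` and `RealityCheck` shapes and the `realityCheck` function are hypothetical names, and the 1 person × 20 hours/week × 26 weeks split is just one assumed way to arrive at the 520 hours in the example above.

```typescript
// Hypothetical shape for the user's stated capacity (names are assumptions).
interface Capacity {
  people: number;
  hoursPerWeek: number;
  weeks: number;
}

interface RealityCheck {
  availableHours: number; // people x hours/week x weeks
  overrunFactor: number;  // estimated hours / available hours
  yearsAtPace: number;    // calendar years to finish at the same weekly pace
  fits: boolean;          // true when the estimate fits inside the capacity
}

function realityCheck(estimatedHours: number, capacity: Capacity): RealityCheck {
  const availableHours = capacity.people * capacity.hoursPerWeek * capacity.weeks;
  if (availableHours <= 0) {
    throw new RangeError("capacity must be positive");
  }
  const overrunFactor = estimatedHours / availableHours;
  const yearsAtPace = (overrunFactor * capacity.weeks) / 52;
  return { availableHours, overrunFactor, yearsAtPace, fits: overrunFactor <= 1 };
}

// Assumed split that yields 520 hours: 1 person x 20 h/week x 26 weeks.
const check = realityCheck(7300, { people: 1, hoursPerWeek: 20, weeks: 26 });
console.log(
  `${check.availableHours} h available, about ${Math.round(check.overrunFactor)}x over, ` +
    `roughly ${Math.round(check.yearsAtPace)} years at this pace`
);
```

When `fits` is true, the same result could drive the scale-up branch instead of the cut-down one.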
How we built it
The core is a hybrid: two Claude calls wrapping a deterministic engine.
- Extract — Claude turns messy prose into a strict, Zod-validated `Scope` (structured outputs — it can't invent a system the engine doesn't know).
- Engine — pure TypeScript does every number: effort = systems + content, × risk multipliers, × a leverage-weighted AI-tooling discount, vs capacity. Same input → same output, always. Covered by 70 deterministic tests.
- Narrate — Claude turns the cold numbers into a warm coaching message, forbidden from inventing or changing a single figure.
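To make the engine step concrete, here is a minimal sketch of one way the formula could be read. Everything in it is an assumption for illustration: the `WorkItem` and `EngineInput` shapes, the per-item `aiLeverage` weight, the way the discount is applied per item, and all the hour values. The real engine may combine the terms differently.

```typescript
// Hypothetical unit of work: a system or a content bucket (names are assumptions).
interface WorkItem {
  label: string;
  baseHours: number;
  aiLeverage: number; // 0..1: how much AI tooling can help with this item
}

interface EngineInput {
  systems: WorkItem[];
  content: WorkItem[];
  riskMultipliers: number[]; // e.g. a first-project tax; values are made up here
  aiTooling: number;         // 0..1: strength of the team's AI tooling
}

// Pure function: no clock, no randomness, no network, so same input gives same output.
function estimateHours(input: EngineInput): number {
  // Leverage-weighted discount: high-leverage items shrink more.
  const discounted = (item: WorkItem): number =>
    item.baseHours * (1 - input.aiTooling * item.aiLeverage);
  const base = input.systems
    .concat(input.content)
    .reduce((sum, item) => sum + discounted(item), 0);
  const risk = input.riskMultipliers.reduce((product, m) => product * m, 1);
  return base * risk;
}

const pitch: EngineInput = {
  systems: [
    { label: "multiplayer", baseHours: 800, aiLeverage: 0.25 },
    { label: "crafting", baseHours: 300, aiLeverage: 0.5 },
  ],
  content: [{ label: "levels", baseHours: 400, aiLeverage: 0.5 }],
  riskMultipliers: [1.5],
  aiTooling: 0.5,
};

console.log(estimateHours(pitch)); // 1837.5
```

Because the function is pure, each "what-if" control can simply change one input field and re-run it.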
This is our answer to "why AI?": the LLM is essential to understand a free-form idea and explain a result like a mentor — but the trustworthy numbers come from code that can't hallucinate. No RAG; the knowledge base is small and structured.
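The "no invented figures" rule for the narration step could also be checked after the fact. The write-up does not say how the rule is enforced, so the sketch below is purely an assumption: `extractFigures` and `findInventedFigures` are hypothetical helpers that pull every number out of the narrated text and flag any that the engine did not produce.

```typescript
// Pull every numeric token out of a piece of prose, e.g. "7,300" -> 7300.
function extractFigures(text: string): number[] {
  const matches = text.match(/\d[\d,]*(?:\.\d+)?/g) || [];
  return matches.map((token) => Number(token.replace(/,/g, "")));
}

// Return the figures in the narration that are NOT in the engine's output.
// An empty result means the narration only used numbers it was given.
function findInventedFigures(narration: string, engineFigures: number[]): number[] {
  return extractFigures(narration).filter(
    (figure) => engineFigures.indexOf(figure) === -1
  );
}

const engineFigures = [7300, 520, 14];

const faithful = "You need about 7,300 hours but have 520, so you're roughly 14x over.";
const drifted = "You need about 8,000 hours but have 520.";

console.log(findInventedFigures(faithful, engineFigures)); // []
console.log(findInventedFigures(drifted, engineFigures));  // [ 8000 ]
```

A check like this is deliberately strict: harmless numbers such as the "1" in "Day-1" would also be flagged unless they are added to the allowed list.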
Validating the numbers (what we're proudest of)
"Are these hours even real?" So we checked them against shipped games. Stardew Valley took its solo dev ~16,000 hours over 4.5 years — Ship It estimated ~16,600, within ~4%. The validation even caught a real miscalibration (puzzle levels priced like action levels), which we fixed. The numbers are pressure-tested.
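The "within ~4%" figure follows directly from the two numbers quoted above, taking the reported ~16,000 hours as the reference value:

```latex
\[
\frac{\lvert 16{,}600 - 16{,}000 \rvert}{16{,}000}
  = \frac{600}{16{,}000}
  = 0.0375 \approx 4\%
\]
```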
Challenges we ran into
- Run-to-run drift on the same pitch — fixed with `temperature: 0` and moving the experience taxes out of the LLM into user toggles.
- A crash from the model emitting `count: 0` — the kind of edge only live testing surfaces.
- The hardest part was keeping the LLM entirely out of the math while still feeling intelligent.
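The write-up does not describe what exactly crashed on `count: 0` or how it was fixed. As one hedged possibility, a guard between extraction and the engine could drop entries the engine cannot price. The `ContentEntry` shape and `dropEmptyEntries` helper below are hypothetical.

```typescript
// Hypothetical shape of one extracted content entry; field names are illustrative.
interface ContentEntry {
  kind: string;
  count: number;
}

// Keep only entries with a finite, positive count before they reach the engine.
function dropEmptyEntries(entries: ContentEntry[]): ContentEntry[] {
  return entries.filter((entry) => isFinite(entry.count) && entry.count > 0);
}

const extracted: ContentEntry[] = [
  { kind: "levels", count: 12 },
  { kind: "bosses", count: 0 }, // the kind of value live testing surfaced
];

console.log(dropEmptyEntries(extracted)); // only the "levels" entry remains
```

An alternative design would reject such input at the schema level instead of filtering it, so the model's mistake is visible rather than silently dropped.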
What we learned
- The hardest part of an AI app is deciding what the AI should NOT do.
- Live testing with weird inputs beats unit tests for the bugs that matter.
- Validation isn't just reassurance — it found a real flaw.
What's next
- More genres + data-calibrated effort numbers from real postmortems.
- Few-shot extraction for even tighter consistency.
- Save, share, and track a plan against its roadmap over time.
Built With
- anthropic-claude
- next.js
- node.js
- react
- render
- tailwindcss
- typescript
- zod