Inspiration (Problem)
AI can write code, but teams can’t trust why it was written. Specs drift, tests fail, and there’s no audit trail from real customer evidence to shipped code.
What it does (Solution)
Growpad runs a stateful, multi-step workflow: Evidence → Insights → Decision Memo → PRD → Tickets → Code Diff → Tests → Drift Check → Export (ZIP). If verification fails, it performs bounded self-healing (≤2 retries) and re-verifies.
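The retry loop is the heart of that self-healing step. Here is a minimal sketch of the bounded verify-and-retry logic; `stage.run`, `stage.verify`, and the failure-report feedback are hypothetical names for illustration, not Growpad's actual internals.

```python
def run_with_self_healing(stage, artifact, max_retries=2):
    """One initial attempt plus at most `max_retries` self-healing retries."""
    feedback = None
    for attempt in range(max_retries + 1):
        output = stage.run(artifact, feedback=feedback)  # feedback is None on the first attempt
        report = stage.verify(output)                    # e.g. re-run tests / drift check on the output
        if report.passed:
            return output, report
        feedback = report  # self-heal: feed the failure report into the next attempt
    raise RuntimeError(f"{stage.name}: still failing after {max_retries} retries")
```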
How it works (What judges should understand in 20 seconds)
- Panel 1: Evidence + guardrails (OKR, forbidden paths, max retries = 2, failure-injection toggle); see the config sketch after this list
- Panel 2: Pipeline log (each stage + retry counter + pass/fail)
- Panel 3: Artifact tabs (Evidence Map, Decision Memo, PRD, Tickets, Diff, Tests, Drift Report, Scorecard) + Download ZIP
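The Panel 1 guardrails boil down to a small settings object. The sketch below is illustrative only; the field names and defaults are assumptions, not Growpad's real config schema.

```python
from dataclasses import dataclass, field

@dataclass
class Guardrails:
    # All names and defaults are illustrative, not the actual config schema.
    okr: str = "Improve activation"            # the OKR every artifact must trace back to
    forbidden_paths: list[str] = field(default_factory=lambda: ["infra/", "billing/"])
    max_retries: int = 2                       # hard cap on self-healing retries
    inject_failure: bool = False               # demo toggle that forces one verification failure
```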
How we built it with Gemini 3
- Thought signatures to keep state across stages
- Function calling to orchestrate tools (repo scan, diff, tests, export)
- Structured outputs to generate deterministic JSON artifacts (sketched below)
- Long context to ingest large evidence packs
- Code execution to compute verification metrics and validate outputs
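As one concrete example, the structured-outputs piece can be expressed with the google-genai Python SDK by passing a response schema, so each artifact comes back as validated JSON. The model id and the DecisionMemo fields below are placeholders, not the exact ones used in the project.

```python
from google import genai
from google.genai import types
from pydantic import BaseModel

class DecisionMemo(BaseModel):
    problem: str
    options: list[str]
    recommendation: str

client = genai.Client()  # reads the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",  # stand-in id; swap in the Gemini 3 model used in the project
    contents="Turn this evidence pack into a decision memo: ...",
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=DecisionMemo,  # forces a deterministic JSON shape
    ),
)
memo = DecisionMemo.model_validate_json(response.text)
```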
Challenges we ran into
Making failure deterministic, retries bounded, and outputs judge-readable—without turning this into a prompt wrapper.
Accomplishments we're proud of
- A full "Evidence → Verified PR" loop with visible logs and a downloadable proof pack.
- An intentional failure that gets corrected in ≤1 retry (cap is 2).
- A clean demo UX that clearly communicates tool use + verification.
What's next
Connectors (Intercom/Gong/Mixpanel), human approval gates, continuous nightly synthesis.
