About the Project

Inspiration

By 2026, AI tools have gotten better at generating things; some can produce slides or LaTeX documents. But "generate" is where they stop. The output is a one-shot artifact with no structure underneath. You can't ask the AI to revise just the kitchen layout without regenerating the entire plan. There's no version history. No quality check. No way to know whether the result is valid against any real standard. You still end up in PowerPoint or AutoCAD, cleaning things up by hand.

We started asking: what if the AI didn't just generate a document — what if it could iterate, validate, and commit it? What if the output had real structure you could edit incrementally, reviewed by an independent agent before it was finalized?

That question led us to Moment — the last design system you'll need.

What We Learned

Clean prompts and clean tool contracts matter more than model choice.

We started with multi-agent separation from day one — that was never the question. The hard part was making each agent reliable. A ReAct loop is only as good as its prompt clarity and tool boundaries. Vague instructions produce vague reasoning. Overloaded tools produce unpredictable calls. We learned to treat every system prompt and every tool definition as a contract — precise, minimal, testable.
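As a concrete illustration of "tool as contract" (a sketch in the spirit of the idea; the names and schema here are our own, not Moment's actual codebase), a tool definition can be kept to one verb, one object, and fully typed parameters:

```python
# A minimal sketch of a tool definition treated as a contract, assuming a
# function-calling API that accepts JSON-schema declarations.
# All names here are illustrative, not from the Moment codebase.
UPDATE_ROOM_TOOL = {
    "name": "update_room",
    "description": "Update exactly one room in the current draft. "
                   "Does not create rooms and does not touch other rooms.",
    "parameters": {
        "type": "object",
        "properties": {
            "room_id": {
                "type": "string",
                "description": "ID of an existing room in the draft.",
            },
            "patch": {
                "type": "object",
                "description": "Only the fields to change, e.g. {'width_m': 3.2}.",
            },
        },
        "required": ["room_id", "patch"],
        "additionalProperties": False,
    },
}
```

One verb, one responsibility, no optional escape hatches: if the agent wants to add a room, that is a different tool. That narrowness is what makes the contract testable.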

The artifact lifecycle is Create → Draft → Audit → Commit. Drafts update incrementally rather than through full regeneration; commits are immutable. To close the GenAI loop, we pass the committed wireframe to Gemini's image model, which produces a photorealistic interior visualization: one pipeline from natural language to lifestyle image.
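A minimal sketch of how such a lifecycle can be modeled (our own illustration under assumed field names, not Moment's real schema): drafts mutate in place via small patches, and a commit is a frozen, audited snapshot.

```python
# Sketch of the Create -> Draft -> Audit -> Commit lifecycle.
# Illustrative only; field names and audit rules are assumptions.
from dataclasses import dataclass, field
from types import MappingProxyType

@dataclass
class Draft:
    artifact: dict = field(default_factory=dict)
    version: int = 0

    def apply_patch(self, patch: dict) -> None:
        """Incremental edit: merge only the keys that changed."""
        self.artifact.update(patch)
        self.version += 1

def audit(draft: Draft) -> list[str]:
    """Independent check before commit; returns a list of violations."""
    issues = []
    if not draft.artifact.get("rooms"):
        issues.append("plan has no rooms")
    return issues

def commit(draft: Draft):
    """Freeze the draft into an immutable snapshot, gated by the audit."""
    issues = audit(draft)
    if issues:
        raise ValueError(f"audit failed: {issues}")
    return MappingProxyType(dict(draft.artifact))  # read-only view

draft = Draft()
draft.apply_patch({"rooms": [{"id": "kitchen", "width_m": 3.2}]})
wireframe = commit(draft)  # the committed artifact then feeds the image model
```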

Challenges We Faced

Multi-agent prompt stability is a real engineering problem. When you have four agents with different roles, each prompt has to be precise enough to keep the agent on task — but flexible enough to handle diverse user intent. Small wording changes in one agent's system prompt would cascade into unexpected behavior downstream. We spent more time stabilizing prompt interactions across the agent graph than writing any single feature.

End-to-end evaluation had to be automated — or it didn't work. You can't manually QA a multi-agent system. Every change to a prompt, a schema, or a rendering rule could break something three steps later. We built an automated evaluation pipeline that runs real scenarios end-to-end and uses LLM-as-judge to score the output. Without it, we were flying blind — with it, we could iterate with confidence.
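For flavor, an end-to-end evaluation loop with an LLM judge can be as small as the sketch below. The scenario format, judge prompt, and pass threshold are all our assumptions, and `run_pipeline` / `ask_judge` are hypothetical placeholders for the real multi-agent system and a judge-model call:

```python
# Sketch of an automated end-to-end eval with LLM-as-judge.
# `run_pipeline` and `ask_judge` are hypothetical stand-ins.
SCENARIOS = [
    {"prompt": "Plan a 60 m^2 two-bedroom apartment", "must_have": "kitchen"},
]

JUDGE_TEMPLATE = (
    "You are grading a generated floor plan.\n"
    "Requirement: {must_have}\nOutput: {output}\n"
    "Reply with a single integer score from 1 (fail) to 5 (perfect)."
)

def evaluate(run_pipeline, ask_judge, threshold: int = 4) -> bool:
    """Run every scenario end-to-end; fail the build if any score dips."""
    for case in SCENARIOS:
        output = run_pipeline(case["prompt"])  # full multi-agent run
        score = int(ask_judge(JUDGE_TEMPLATE.format(
            must_have=case["must_have"], output=output)))
        if score < threshold:
            print(f"FAIL ({score}/5): {case['prompt']}")
            return False
    return True
```

Wired into CI, a loop like this turns "did my prompt tweak break agent three?" from a manual hunt into a red or green check.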

The constant temptation to over-design. At every layer — agent routing, schema structure, rendering logic — the easy path was to add a special case, hardcode an assumption, or build an abstraction "just in case." We learned to resist. Every hardcoded shortcut became a liability within a week. The system only stayed coherent because we kept asking: is this the minimal design that solves the actual problem? Minimalism isn't aesthetic here — it's a survival strategy.

How It's Different

| What most AI tools do (2026) | What Moment does |
| --- | --- |
| Single model, one-shot generation | Four agents with separation of duties |
| Output is Markdown or a flat image | Structured, schema-validated artifact |
| "Regenerate" replaces everything | Incremental edits — change only what changed |
| No quality gate — you are the QA | Independent Auditor reviews before commit |
| Full chat history as context (bloated) | Isolated context per agent (lean) |
| Outputs live in the chat thread | Versioned, committed, downloadable documents |

We're not replacing designers. We're giving everyone a design-literate AI team — one that produces professional documents, not drafts you have to clean up.
