Inspiration
Existing AI cloud architecture generators have two major flaws:
- They produce nearly identical architectures regardless of the project, suggesting they rely on generic templates rather than analyzing the problem.
- They don't allow iterative refinement; if a requirement changes, users must regenerate the entire design from scratch.
In real-world architecture design, iteration is essential.
This led us to build ArchiFlow, an AI-powered AWS architect that treats every project as unique and allows users to refine architecture designs through natural language change requests.
What It Does
ArchiFlow is an AI cloud architect that designs production-ready AWS architectures from plain English or voice descriptions.
Example prompt:
“Design a multi-tenant SaaS analytics platform handling 10k concurrent users with real-time dashboards.”
ArchiFlow will:
- Research best practices for the specific problem
- Select the most appropriate AWS services
- Design a scalable and fault-tolerant architecture
- Generate a visual architecture diagram
- Estimate infrastructure cost
- Surface assumptions and tradeoffs
Most importantly, users can refine the architecture without restarting the process.
Example change request:
“Replace EC2 with serverless services where possible.”
ArchiFlow dynamically resumes the architecture pipeline from the appropriate step, preserving prior decisions.
Why This Matters
Cloud architecture design is complex and often requires multiple iterations between engineers, architects, and stakeholders.
ArchiFlow reduces this friction by allowing teams to collaborate with an AI architect that supports iterative design, making architecture exploration faster and more accessible.
How We Built It
ArchiFlow is built on LangGraph, a low-level orchestration framework that maintains shared state across agents (nodes).
Voice Input via Nova Sonic
Nova Sonic handles voice intake for project descriptions and change requests. It generates a spoken response summarizing its understanding, mirroring the project description displayed at the top of the UI, so users can verify that ArchiFlow has understood them correctly before the graph runs. To keep the experience snappy, Nova Sonic is constrained to a single response and interrupted mid-sentence if the reply grows unnecessarily long.
Heterogeneous Nova Model Architecture
Each Nova model is mapped to agents by task complexity, a deliberate design choice to maximize accuracy while minimizing cost and resource usage:
| Model | Complexity | Responsibilities |
|---|---|---|
| Nova Micro | Low | Summarization, node routing |
| Nova Lite | Medium | Search query generation, structured output formatting |
| Nova Pro | High | Extracting key content, deep research, architecture design |
Using one large model for everything would raise cost and latency significantly without improving the output. Matching the model to each step is the more principled approach.
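The mapping in the table above can be sketched as a small lookup, shown here only as an illustration. The task names and the tier labels (`nova-micro`, `nova-lite`, `nova-pro`) are placeholders chosen for this sketch, not real model identifiers or ArchiFlow's actual configuration.

```python
from enum import Enum


class Complexity(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Placeholder tier labels (assumption): swap in real model IDs when wiring this up.
MODEL_BY_COMPLEXITY = {
    Complexity.LOW: "nova-micro",
    Complexity.MEDIUM: "nova-lite",
    Complexity.HIGH: "nova-pro",
}

# Hypothetical task names that mirror the responsibilities column of the table.
TASK_COMPLEXITY = {
    "summarization": Complexity.LOW,
    "node_routing": Complexity.LOW,
    "search_query_generation": Complexity.MEDIUM,
    "structured_output_formatting": Complexity.MEDIUM,
    "key_content_extraction": Complexity.HIGH,
    "deep_research": Complexity.HIGH,
    "architecture_design": Complexity.HIGH,
}


def model_for(task: str) -> str:
    """Return the model tier assigned to a task; raises KeyError for unknown tasks."""
    return MODEL_BY_COMPLEXITY[TASK_COMPLEXITY[task]]
```

Keeping the mapping in one place means a task can be moved to a cheaper or stronger tier by editing a single entry.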
Single Responsibility Principle (SRP)
Each agent is responsible for exactly one concrete task. This reduces hallucination, makes each node easier to debug, and keeps outputs predictable for downstream agents that depend on them.
Critic Pattern
For high-complexity nodes, a lightweight reviewer function catches and corrects structural issues without re-running the graph from scratch. The Research Analyst, Solution Architect, and Cost Estimator nodes use Nova Lite-powered reviewers to enforce a structured output format. The Designer node uses one to strip extraneous text from Mermaid diagrams so they render correctly.
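As a rough illustration of the Designer reviewer's job, the sketch below strips surrounding chatter from a model reply and keeps only the Mermaid source. It is a rule-based stand-in written for this example; the function name `extract_mermaid` and the list of recognized diagram keywords are assumptions, and the actual reviewer may work differently (for instance, by calling a model).

```python
import re

# Build the fence marker programmatically so it does not clash with this code block.
TICKS = "`" * 3

# Matches an optional "mermaid" language tag and captures the fenced body.
_FENCE = re.compile(TICKS + r"(?:mermaid)?\s*\n(.*?)" + TICKS, re.DOTALL)

# A few common Mermaid diagram openers; extend as needed (assumed list).
_DIAGRAM_START = re.compile(r"^\s*(graph|flowchart|sequenceDiagram)\b", re.MULTILINE)


def extract_mermaid(raw: str) -> str:
    """Return only the Mermaid source found in a model reply.

    Text before the diagram is always dropped. Text after the diagram is
    dropped only when the diagram is fenced; unfenced trailing prose is kept.
    """
    fenced = _FENCE.search(raw)
    body = fenced.group(1) if fenced else raw
    start = _DIAGRAM_START.search(body)
    if start is None:
        raise ValueError("no Mermaid diagram found")
    return body[start.start():].strip()
```

A cleanup step like this touches only the diagram text, so a stray sentence in the reply does not force the Designer node to run again.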
Structured Inter-Agent Communication
Agents communicate via JSON rather than plain text, ensuring no key detail is lost between nodes. Getting consistent JSON from a model that is already handling deep research is the core problem the Critic Pattern solves.
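A minimal sketch of this idea follows, assuming a hypothetical `ServiceChoice` message with two fields. It uses standard-library dataclasses as a stand-in for the Pydantic models the project uses, and `fake_reviewer` stands in for the Nova Lite reviewer call; neither reflects ArchiFlow's real schema or prompts.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServiceChoice:
    """Hypothetical message passed between two agents."""
    service: str
    reason: str


def parse_choice(raw: str) -> ServiceChoice:
    """Parse a JSON string into a ServiceChoice, raising if it is malformed."""
    data = json.loads(raw)
    return ServiceChoice(service=str(data["service"]), reason=str(data["reason"]))


def parse_with_critic(raw: str, reviewer: Callable[[str], str]) -> ServiceChoice:
    """Try to parse the node output; on failure, let the reviewer reformat it once."""
    try:
        return parse_choice(raw)
    except (json.JSONDecodeError, KeyError, TypeError):
        # One repair attempt only; a second failure propagates to the caller.
        return parse_choice(reviewer(raw))


def fake_reviewer(raw: str) -> str:
    """Stand-in for a model-backed reviewer: wraps free text into the expected JSON."""
    return json.dumps({"service": "AWS Lambda", "reason": raw.strip()})
```

Well-formed output passes straight through, so the reviewer adds cost only when the upstream node actually breaks the format.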
Challenges We Ran Into
1. Nova Act state misinterpretation
When tasked with browsing for AI engineering books, Nova Act would get stuck on CAPTCHA-protected sites and retry indefinitely. Rather than patching around it with a Circuit Breaker node, we addressed the underlying issue by implementing a ReAct (Reason + Act) pattern: a dedicated node observes the current state, reasons about it, and directs Nova Act on what to do next. Full write-up: Improve Nova Act Accuracy - The Nova Lite reasons, and the Nova Act acts
2. Implementing the Change Request feature
The core challenge was dynamically selecting the next node. A conditional LangGraph node uses a router method powered by Nova Lite to decide which node to resume from. Since the graph state is fully preserved, any node can be the entry point. Because later nodes depend on earlier decisions, execution always continues through to the end from whichever node is selected.
3. Getting reliable structured output
Even using PydanticOutputParser.get_format_instructions() wasn't sufficient when the same agent was also responsible for deep research; asking one agent to do two demanding jobs degraded both. The Critic Pattern resolved this: a Nova Lite reviewer takes the node's raw output and reformats it into the correct Pydantic-compliant structure, restoring accuracy on both fronts.
4. Ending the Nova Sonic session at the right time
The original design aimed for a multi-turn requirements-gathering conversation, capped at three rounds. The problem: Nova Sonic would start narrating the architecture design in voice as soon as it felt it had enough information, rather than cleanly handing off to the graph. The fix was to constrain Nova Sonic to one response only, with automatic interruption if the reply runs long.
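The resume behavior described in item 2 can be sketched in plain Python, without LangGraph, to show the control flow only. The node names, their order, and the `resume_from` helper are assumptions made for this example; in ArchiFlow the entry node is chosen by a model-backed router rather than passed in by hand.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]

# Assumed node order for illustration; the real graph may differ.
PIPELINE: List[str] = ["research", "architect", "designer", "cost"]


def make_node(name: str) -> Node:
    """Build a dummy node that records its run and writes one output key."""
    def node(state: State) -> State:
        runs = list(state.get("runs", []))
        runs.append(name)
        return {**state, "runs": runs, name: f"{name} output"}
    return node


NODES: Dict[str, Node] = {name: make_node(name) for name in PIPELINE}


def resume_from(entry: str, state: State) -> State:
    """Run the entry node and every node after it, reusing the preserved state."""
    start = PIPELINE.index(entry)
    for name in PIPELINE[start:]:
        state = NODES[name](state)
    return state


# First run executes everything; a change request re-enters at "designer".
first = resume_from("research", {})
second = resume_from("designer", {**first, "runs": []})
```

In the second run only `designer` and `cost` execute, while the `research` and `architect` outputs from the first run are still present in the state.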
Accomplishments We're Proud Of
1. Knowing when to step back from new technology
Nova Act was the newest tool available, so we started there. We got it working, and then recognized that using it for cost calculation was consuming far more time and resources than the task warranted. Sometimes the right call is stepping back from a working solution because the cost of running it doesn't justify the result.
2. The Critic Pattern as a first-class design tool
Applying the Critic Pattern to fix structural output issues in-place, rather than re-running expensive graph paths, is a meaningful architectural win. It keeps the system resilient without inflating cost or latency.
3. Heterogeneous LLM Architecture done right
Using multiple models in one application is often a red flag for overengineering. This project demonstrated the opposite case: precisely matching model capability to task complexity lowered cost, reduced hallucination, and produced better outputs than a uniform Nova Pro approach would have.
What We Learned
1. How to make Nova Act more accurate using the ReAct pattern
Nova Act alone struggled with ambiguous or blocked states, retrying indefinitely rather than reasoning about the situation. Introducing a dedicated ReAct node that observes the current state and decides the next action made it significantly more reliable. The key insight: don't just act, reason first.
2. How to build a real-time voice AI agent
Building with Nova Sonic was a new challenge that came with its own set of unexpected behaviors around session control and response timing.
3. The Critic Pattern as a practical tool for structured output reliability
Asking one agent to handle both deep research and structured formatting was too much; output quality on both fronts suffered. The Critic Pattern, using a lightweight Nova Lite reviewer, separates these concerns cleanly. It also means a minor formatting issue doesn't trigger a full graph re-run.
4. When and how Heterogeneous LLM Architecture is the right call
The instinct to avoid mixing multiple models is usually sound, but this project showed the exception. When tasks vary meaningfully in complexity, matching each agent to the right model improves accuracy, reduces hallucination, and cuts cost. The lesson isn't "use more models"; it's "use the right one for each job."
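The ReAct lesson from item 1 can be sketched as a bounded reason-then-act loop. Everything here is hypothetical: `reason` is a rule-based stand-in for the reasoning node, `fake_browser` stands in for Nova Act, and the action names are invented for the example.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (action taken, observation returned)


def react_loop(
    reason: Callable[[str, List[Step]], str],
    act: Callable[[str], str],
    observation: str,
    max_steps: int = 5,
) -> List[Step]:
    """Alternate reasoning and acting until the reasoner says stop or steps run out."""
    history: List[Step] = []
    for _ in range(max_steps):
        action = reason(observation, history)
        if action == "stop":
            break
        observation = act(action)
        history.append((action, observation))
    return history


def reason(observation: str, history: List[Step]) -> str:
    """Stand-in reasoner: move on when blocked instead of retrying the same site."""
    if "captcha" in observation:
        return "try_next_site"
    if "results" in observation:
        return "stop"
    return "open_site"


def fake_browser(action: str) -> str:
    """Stand-in for the browsing agent: returns a scripted observation per action."""
    responses = {
        "open_site": "captcha challenge shown",
        "try_next_site": "results page loaded",
    }
    return responses[action]


history = react_loop(reason, fake_browser, "start")
```

The `max_steps` bound is an extra safeguard added in this sketch; the main point is that each action is chosen from the latest observation rather than repeated blindly.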
What's Next for ArchiFlow
- Live AWS pricing integration - connecting directly to Amazon's pricing data for real cost estimates
- Cloud services knowledge base - replacing browser-LLM knowledge with a curated database of AWS services and descriptions for more grounded recommendations
- Richer architecture diagrams - moving from Mermaid to generated images, while avoiding the trap of over-constraining the model with examples that narrow the output space
- General-purpose cloud assistant mode - extending ArchiFlow beyond architecture design so users can, for example, submit an existing architecture and get pricing details back. This requires converting user input into the state parameters each node expects, and a smarter routing mechanism to direct requests to the right node without a full graph run
#Amazon-Nova #Nova2-Sonic #Nova-Pro #Nova2-Lite #Nova-Micro
Built With
- fast-api
- langgraph
- nova-2-lite
- nova-2-sonic
- nova-micro
- nova-pro
- python
- streamlit(for-demo-purpose)
