Inspiration
Infrastructure-as-Code (IaC) generation holds significant promise for automating cloud infrastructure provisioning. However, LLMs are not yet accurate at generating IaC templates: evaluation reveals that state-of-the-art LLMs perform poorly, with Claude-3.5 and Claude-3.7 achieving only 30.2% and 26.8% deployment success, respectively.
Current state:
- 42.7% of syntactically correct templates fail deployment
- only 25.2% of templates fully match user intent
- only 8.4% of deployable templates pass security compliance
We want an agentic system that optimizes for deployability, not just code syntax, and converges to a safe, compliant, running stack on AWS.
User Feedback
We interviewed founders and engineers from 20+ early-stage startups about DevOps on AWS. The consistent message was that the learning curve is steep and that no single tool guides teams from requirements to safe deployment. Current LLMs provide good instructions, but following those steps still requires significant time and hands-on knowledge.
What it does
Vibe DevOps is an agentic cloud architect for AWS. You describe requirements in natural language. The system proposes an AWS architecture, generates IaC, validates it, and opens a reviewable deployment plan. It wires basic monitoring and cost estimates, visualizes the stack on a canvas, and supports gated deploys with rollbacks.
Key capabilities:
- Requirement extraction (services, SLAs, constraints).
- Architecture proposals aligned to AWS Well-Architected pillars.
- IaC generation (Terraform/CloudFormation) with a module graph.
- Pre-deployment checks (security, cost, compliance, drift).
- Deployment planning, apply, and rollback.
- Alarms/dashboards and cost delta estimates.
- Visual task tree and stack topology view.
How we built it
Multi-agent design (via Strands Agents SDK + Bedrock):
- Requirements Agent – extracts services/SLAs/constraints.
- Cloud Architecture Agent – proposes target AWS design.
- IaC Generation Agent – emits Terraform/CloudFormation and variables.
- Deployment Agent – plans, applies, rolls back; handles drift.
- Monitoring/Cost Agents – attach alarms/dashboards; estimate cost deltas.
- Vibe Orchestrator – coordinates agents and checks progress toward a deployable state.
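To illustrate how an orchestrator can check progress toward a deployable state, here is a minimal sketch of a generate/validate feedback loop. It is not the project's actual code: `converge`, `stub_generate`, and `stub_validate` are hypothetical names, and the stubs stand in for the IaC Generation Agent and the validation pipeline.

```python
from typing import Callable

# Hypothetical signatures, assumed for illustration only.
Generate = Callable[[str, list[str]], dict]  # (requirements, feedback) -> template
Validate = Callable[[dict], list[str]]       # template -> list of error messages


def converge(requirements: str, generate: Generate, validate: Validate,
             max_rounds: int = 3) -> tuple[dict, int]:
    """Regenerate the template until validation passes or rounds run out."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        template = generate(requirements, feedback)
        feedback = validate(template)
        if not feedback:  # no errors left: deployable by our checks
            return template, round_no
    raise RuntimeError(f"not deployable after {max_rounds} rounds: {feedback}")


def stub_generate(requirements: str, feedback: list[str]) -> dict:
    # Stand-in for an LLM agent: enables encryption only once told to.
    return {"bucket": requirements,
            "encrypted": any("encrypt" in msg for msg in feedback)}


def stub_validate(template: dict) -> list[str]:
    # Stand-in for the validation pipeline: a single encryption guardrail.
    return [] if template["encrypted"] else ["enable encryption on bucket"]
```

With these stubs, `converge("logs", stub_generate, stub_validate)` fails validation in the first round, feeds the error back to the generator, and succeeds in the second.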
Validation pipeline: static checks (lint/format/schema) → dry-run/plan analysis → policy and security review (guardrails). The focus is on deployability and compliance.
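The staged pipeline above could be sketched roughly as follows. This is an assumption-laden illustration, not the real implementation: the template is a plain dict, and `static_check`, `plan_check`, and `policy_check` are hypothetical stand-ins for linting, plan analysis, and guardrail review. Stages run in order and the pipeline stops at the first failing stage, so cheap checks gate expensive ones.

```python
from dataclasses import dataclass, field


@dataclass
class StageResult:
    stage: str
    errors: list[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def static_check(template: dict) -> list[str]:
    # Stand-in for lint/format/schema: every resource needs a "type".
    return [f"{name}: missing type"
            for name, res in template.get("resources", {}).items()
            if "type" not in res]


def plan_check(template: dict) -> list[str]:
    # Stand-in for dry-run/plan analysis: dependencies must resolve.
    resources = template.get("resources", {})
    return [f"{name}: unknown dependency {dep}"
            for name, res in resources.items()
            for dep in res.get("depends_on", [])
            if dep not in resources]


def policy_check(template: dict) -> list[str]:
    # Stand-in guardrail: buckets must be encrypted.
    return [f"{name}: encryption disabled"
            for name, res in template.get("resources", {}).items()
            if res.get("type") == "bucket" and not res.get("encrypted", False)]


# Hypothetical stage order mirroring the pipeline described in the text.
STAGES = [("static", static_check), ("plan", plan_check), ("policy", policy_check)]


def validate(template: dict) -> list[StageResult]:
    """Run stages in order and stop at the first stage that reports errors."""
    results = []
    for name, check in STAGES:
        result = StageResult(name, check(template))
        results.append(result)
        if not result.ok:
            break
    return results
```

A template with an unresolved dependency stops at the plan stage, so the policy stage is never reached for it.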
Implementation highlights:
- Backend: FastAPI, PostgreSQL, WebSockets, structured logging.
- Frontend: React/TypeScript/Vite with canvas/kanban views and task tree.
- IaC: Terraform modules and plan inspection.
- Models: Amazon Bedrock (Claude/Nova) via Strands SDK.
- Patterns: Hierarchical task decomposition; concurrent sub-tasks with clear parent/child lineage.
- AWS deployment plan: containerized API on ECS Fargate, RDS PostgreSQL, Bedrock for LLM, ALB in front.
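The hierarchical task pattern listed above can be sketched as a small task tree. This is a minimal illustration under assumptions: the `Task` class, its fields, and the `run` coroutine are hypothetical, and the real work is replaced by a placeholder. Sibling sub-tasks run concurrently, and each task can report its lineage back to the root.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    # Exclude the parent from repr/eq to avoid walking the tree in cycles.
    parent: Optional["Task"] = field(default=None, repr=False, compare=False)
    children: list["Task"] = field(default_factory=list)
    status: str = "pending"

    def add_child(self, name: str) -> "Task":
        child = Task(name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> list[str]:
        """Names from the root task down to this task."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


async def run(task: Task) -> None:
    task.status = "running"
    # Sibling sub-tasks run concurrently; the parent finishes after them.
    await asyncio.gather(*(run(child) for child in task.children))
    await asyncio.sleep(0)  # placeholder for the task's real work
    task.status = "done"
```

For example, a root task "deploy stack" with children "network" and "database" runs both children concurrently, and a "subnets" task under "network" reports the lineage `["deploy stack", "network", "subnets"]`.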
Challenges we ran into
We grouped issues into categories rather than single bugs:
- Requirement ambiguity – incomplete or shifting inputs.
- IaC correctness – syntax vs. deployability gaps.
- Architecture dependencies – ordering, networking, and IAM edges.
- Security & compliance – policies, least privilege, encryption, tagging.
- Performance & cost – rightsizing and cost-aware defaults.
- Observability – making agent decisions traceable and debuggable.
- State & drift – reconciling desired state with live AWS.
- Long-context reasoning – keeping conversations and plans coherent over time.
Accomplishments that we're proud of
- A working hierarchical task system with parent/child tasks, live updates, and clear lineage.
- A multi-agent loop that converges from requirements to a validated plan.
- A deployability-first validation flow that catches common IaC failure modes before apply.
- A visual canvas for the stack and a kanban view for agent tasks.
- Clean interfaces: REST/WebSocket APIs, typed contracts, and test coverage.
What we learned
- Optimizing for “compiles” is not enough; optimize for deploys and policies.
- Clear intermediate artifacts (requirements JSON, architecture spec, module graph, plan diff) make agent reasoning reviewable.
- Guardrails (security, cost, compliance) must be first-class and fast, or they won’t be used.
- Hierarchical tasks with explicit dependencies reduce confusion and speed up recovery from failures.
- Human-in-the-loop at PR/plan gates improves trust without blocking flow.
What's next for Vibe DevOps - AWS
We plan to launch this as an open-source project, with:
- Reinforcement from deploy feedback: learn from failed applies and policy denials.
- Deeper policy packs: CIS/WAF/IAM least-privilege generation and auto-remediation.
- Broader IaC support: stronger Terraform module libraries; CloudFormation/CDK parity.
- Richer observability: traceable agent steps, plan diffs, and rollback timelines.
- Cost-aware planning: compare architectures by price/performance before deploy.
- Safer rollouts: built-in blue/green, canary, and automatic rollback strategies.
- Multi-account/multi-region: landing zones and environment promotion.
- Marketplace of templates: reusable patterns for common AWS workloads.
Built With
- amazon-bedrock-(claude/nova-pro)
- fastapi
- postgresql
- react/typescript/vite
- strands-agents-sdk
- terraform