AutoStack - The Autonomous AI Engineering Team

Inspiration

Building software today is fragmented. Developers juggle code, infrastructure, testing, and documentation, often context-switching between IDEs, cloud consoles, and project management tools. We asked ourselves: What if we could simulate an entire engineering team?

We were inspired by the potential of Agentic AI to move beyond simple code completion to full-scale project ownership. We wanted to create a system where "Provision a GKE cluster" or "Build an E-commerce API" wasn't just a prompt, but a trigger for a coordinated swarm of specialized agents to plan, execute, and verify the work—just like a real team.

What it does

AutoStack is a dual-track autonomous agent system:

  1. Software Development Workflow: A team of four agents (PM, Developer, QA, Documentation) that takes a high-level idea, researches the best tech stack, plans the architecture, and iteratively generates, tests, and documents the code.
  2. Infrastructure Provisioning Workflow: A triad of agents (InfraArchitect, DevOps, SecOps) that designs cloud architecture, writes Terraform code, scans it for security vulnerabilities (Checkov), and estimates costs (Infracost) before a single resource is provisioned.

It features a Real-time Dashboard for watching the agents think, plan, and code as they work, giving users "God-mode" visibility into the AI workforce.

How we built it

We built AutoStack using a modern, scalable stack designed for agentic workflows:

  • Orchestration Engine: We used LangGraph to build stateful, cyclic graphs. This lets our agents loop back, correct their mistakes, and ask for specific feedback, which a linear "chain" cannot do.
  • Intelligence: We leveraged Groq (Llama 3 & Qwen) for lightning-fast inference, critical for keeping the multi-agent conversation fluid.
  • Backend: FastAPI serves as the control plane, managing the WebSocket connections and persisting state to PostgreSQL.
  • Frontend: A responsive Next.js 16 dashboard with TanStack Query for real-time state synchronization, allowing users to approve/reject plans and view live logs.
  • Tools: We integrated Terraform, Checkov, and Tavily Research directly into the agents' toolkits.
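
To illustrate the loop-back idea, here is a minimal sketch in plain Python. It deliberately does not use LangGraph's API; `WorkflowState`, the two node functions, and `max_attempts` are hypothetical names chosen for this example, and the "Developer" is a stub standing in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class WorkflowState:
    """Shared state passed from node to node (hypothetical shape)."""
    code: str = ""
    feedback: str = ""
    attempts: int = 0
    history: List[str] = field(default_factory=list)

def developer(state: WorkflowState) -> str:
    # Stand-in for an LLM call: the first draft is buggy, the revision is fixed.
    state.attempts += 1
    if state.attempts == 1:
        state.code = "def add(a, b): return a - b"
    else:
        state.code = "def add(a, b): return a + b"
    state.history.append("developer")
    return "qa"

def qa(state: WorkflowState) -> str:
    # Execute the generated code and route on the result of one check.
    namespace: dict = {}
    exec(state.code, namespace)
    passed = namespace["add"](2, 3) == 5
    state.history.append("qa:pass" if passed else "qa:fail")
    if passed:
        return "done"
    state.feedback = "add(2, 3) should return 5"
    return "developer"  # the cycle: a failed check goes back to the Developer

NODES: Dict[str, Callable[[WorkflowState], str]] = {"developer": developer, "qa": qa}

def run(max_attempts: int = 3) -> WorkflowState:
    state = WorkflowState()
    node = "developer"
    while node != "done":
        # Retry budget: stop cycling and escalate once the budget is spent.
        if node == "developer" and state.attempts >= max_attempts:
            state.history.append("escalate")
            break
        node = NODES[node](state)
    return state

final = run()
print(final.history)
```

Each node returns the name of the next node, so the routing decision lives next to the logic that produces it. The budget check in `run` is what keeps a permanently failing test from looping forever.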

Challenges we ran into

  • Cyclic Dependency Hell: Designing a graph where agents could "go back to the drawing board" (e.g., QA failing a test sends it back to Developer) without getting stuck in infinite loops was incredibly tricky. We had to implement strict state transition logic and "retry budgets."
  • Context Window Management: With multiple agents passing code files back and forth, context limits were hit quickly. We built a semantic Interface Contract system, where agents only see the "public interface" of other files, keeping the context lightweight.
  • Structured Output Stability: Getting LLMs to consistently output valid JSON for complex Terraform plans was difficult. We used Pydantic models with rigorous validation loops to force the models to self-correct.
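
One way to picture the Interface Contract idea is to strip a source file down to its public signatures before handing it to another agent. The sketch below does this for Python files with the standard `ast` module (Python 3.9+ for `ast.unparse`). It is an assumption about how such a contract could be produced, not a description of AutoStack's actual implementation; `public_interface`, `_stub`, and the sample source are hypothetical.

```python
import ast

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)

def _stub(node, indent: str) -> str:
    """Render a function node as a signature-only line."""
    prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
    return f"{indent}{prefix} {node.name}({ast.unparse(node.args)}){returns}: ..."

def public_interface(source: str) -> str:
    """Return stubs for top-level public functions and public class methods."""
    lines = []
    for node in ast.parse(source).body:
        if node.name.startswith("_") if hasattr(node, "name") else True:
            continue  # skip imports, statements, and underscore-prefixed names
        if isinstance(node, FUNC_TYPES):
            lines.append(_stub(node, indent=""))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}:")
            # Underscore-prefixed methods (including __init__) are omitted for brevity.
            methods = [m for m in node.body
                       if isinstance(m, FUNC_TYPES) and not m.name.startswith("_")]
            lines.extend(_stub(m, indent="    ") for m in methods)
            if not methods:
                lines.append("    ...")
    return "\n".join(lines)

SOURCE = '''
import hashlib

def create_order(user_id: int, items: list) -> dict:
    total = _price(items)
    return {"user": user_id, "total": total}

def _price(items):
    return sum(i["price"] for i in items)

class Cart:
    def __init__(self):
        self.items = []

    def add(self, sku: str, qty: int = 1) -> None:
        self.items.append((sku, qty))
'''

contract = public_interface(SOURCE)
print(contract)
```

Only the signatures reach the other agent's prompt; function bodies, imports, and private helpers stay out of the context window.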

Accomplishments that we're proud of

  • The "Dual-Track" Architecture: We successfully decoupled software logic from infrastructure logic, allowing them to run in parallel or independently.
  • Self-Healing Code: Watching the Developer agent write code, the QA agent fail it, and the Developer agent actually fix it without human intervention was a magic moment.
  • Security-First Infra: Integrating Checkov means our AI doesn't just write code; it writes secure code. It won't let you deploy an S3 bucket with public access unless you explicitly fight it.
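
The "unless you explicitly fight it" behavior can be sketched as a gate over a Checkov report with an explicit override list. This is a hedged illustration only: it assumes the report is the JSON Checkov emits with `-o json`, with failures listed under `results.failed_checks` and a `check_id` on each entry. The sample report is hand-written for the example rather than taken from a real scan, and `blocking_findings` and the override set are hypothetical names.

```python
import json

def blocking_findings(report: dict, overrides: set) -> list:
    """Return failed checks that the user has not explicitly overridden."""
    failed = report.get("results", {}).get("failed_checks", [])
    return [check for check in failed if check["check_id"] not in overrides]

# Hand-written sample in the assumed report shape (not real scan output).
SAMPLE_REPORT = json.loads("""
{
  "check_type": "terraform",
  "results": {
    "failed_checks": [
      {
        "check_id": "CKV_AWS_20",
        "check_name": "S3 bucket allows public READ access",
        "resource": "aws_s3_bucket.assets"
      }
    ]
  },
  "summary": {"passed": 4, "failed": 1}
}
""")

# Default: any failed check blocks the plan.
blocked = blocking_findings(SAMPLE_REPORT, overrides=set())
for check in blocked:
    print(f"BLOCKED by {check['check_id']} on {check['resource']}")

# Explicit override: the user accepts the risk for this one check.
allowed = blocking_findings(SAMPLE_REPORT, overrides={"CKV_AWS_20"})
print("deploy allowed" if not allowed else "deploy blocked")
```

Keeping the override as a named set of check IDs makes every exception visible and auditable, instead of silently disabling the scan.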

What we learned

  • Agents need "Sleep": We realized that continuously running agents drift in quality. Implementing "checkpoints" where the system pauses for human review actually improved the final quality significantly.
  • Specialization > Generalization: A "DevOps Agent" performed 10x better than a generic "Coder Agent" prompted to do DevOps. Giving them distinct personas and tools was key.

What's next for AutoStack

  • Visual Studio Code Extension: Bringing AutoStack directly into the editor.
  • Multi-Cloud Support: Expanding the Infra workflow to support AWS and Azure.
  • Deploy-to-Prod: Integrating a CI/CD pipeline runner to actually deploy the generated Terraform/Code to live environments.
