Inspiration

CloudArena has been stuck in my head since the first semester of my master’s at CU Boulder, right after two years at Adobe babysitting production dashboards. Those shifts left me staring at cloud charts and wondering why the security basics still feel so hard when cloud is everywhere and anyone can spin up an environment in minutes. Most folks never get a real shot at practicing what “secure enough” looks like. The MLH Open Source Hackfest finally pushed me to unplug from everything else, carve out about 35 hours, and turn the idea into the "cloud safety range" I wanted for myself and for anyone trying to do the right thing with their own infrastructure. My hope is that CloudArena gives people a space to experiment, see what breaks, and understand the fix without wading through endless documentation.

What It Does

Hit Create Run and CloudArena lights up a disposable AWS sandbox, walks through the playbook step by step, and streams the whole thing into the FastAPI + HTMX console. By the end you’ve got a concise report that ties every issue back to MITRE ATT&CK and spells out the fix. Today the demo runs against a Terraform stack I built with intentional flaws, but the same flow can point at any Terraform config so teams can rehearse against their real infrastructure. It’s a way to learn, experiment, and break stuff without risking production.
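To make the report step concrete, here is a minimal sketch of how a finding might be tied back to an ATT&CK technique and a fix. It is an illustration, not CloudArena's actual code: the check names, the `Finding` class, `ATTACK_CATALOG`, and the remediation wording are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical lookup table: check name -> (ATT&CK ID, technique name, fix).
# The entries are illustrative assumptions, not CloudArena's real catalog.
ATTACK_CATALOG = {
    "public_s3_bucket": (
        "T1530",
        "Data from Cloud Storage",
        "Enable S3 Block Public Access on the bucket.",
    ),
    "cloudtrail_stopped": (
        "T1562.008",
        "Impair Defenses: Disable or Modify Cloud Logs",
        "Re-enable logging and alert on StopLogging calls.",
    ),
}

# Fallback used when a check has no mapping yet.
UNMAPPED = ("unmapped", "No ATT&CK mapping yet", "Review manually.")


@dataclass
class Finding:
    check: str     # name of the check that fired (hypothetical naming)
    resource: str  # identifier of the affected sandbox resource


def build_report(findings):
    """Return one report line per finding, with technique and fix attached."""
    lines = []
    for finding in findings:
        technique_id, name, fix = ATTACK_CATALOG.get(finding.check, UNMAPPED)
        lines.append(f"{finding.resource}: {technique_id} ({name}). Fix: {fix}")
    return lines


if __name__ == "__main__":
    for line in build_report([Finding("public_s3_bucket", "demo-bucket")]):
        print(line)
```

Keeping the mapping in a plain table means a check without an entry still shows up in the report, flagged as unmapped, instead of silently disappearing.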

How I Built It

  • FastAPI handles the API and the HTMX console, with SQLite quietly tracking every run and event.
  • Celery runs the missions with a mix of boto3 sweeps I wrote and Stratus Red Team detonations.
  • A small planner cracks open the catalog and builds a runbook based on what the sandbox actually looks like.
  • Terraform lays down intentionally vulnerable AWS resources for the demo so there’s something real to poke at, and any other Terraform config can be swapped in when a team wants to test its own environment.
  • Auth0 keeps sign-in smooth, and Gemini rewrites the summary so it reads like an executive briefing.

All of this came together in two mostly sleepless days on VS Code and coffee.
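The planner idea from the list above can be sketched roughly like this. It is a simplified illustration under stated assumptions, not the project's real planner: the `Technique` fields, the catalog entries, the resource-type strings, and the "sweeps before detonations" ordering are all hypothetical. The one Stratus Red Team ID shown should be checked against the Stratus catalog before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technique:
    name: str            # catalog entry name (hypothetical naming)
    executor: str        # "boto3_sweep" or "stratus" (assumed labels)
    requires: frozenset  # resource types that must exist in the sandbox


# Illustrative catalog; the real one would live in a file or database.
CATALOG = [
    Technique("s3_public_access_sweep", "boto3_sweep", frozenset({"aws_s3_bucket"})),
    Technique("aws.defense-evasion.cloudtrail-stop", "stratus", frozenset({"aws_cloudtrail"})),
    Technique("iam_wildcard_policy_sweep", "boto3_sweep", frozenset({"aws_iam_policy"})),
]


def plan_runbook(catalog, sandbox_resource_types):
    """Pick the techniques whose required resources are present.

    The result is deterministic: read-only sweeps come first, detonations
    after, and catalog order is preserved within each group.
    """
    present = set(sandbox_resource_types)
    applicable = [t for t in catalog if t.requires <= present]
    # sorted() is stable, so the False (sweep) group keeps its catalog order.
    return sorted(applicable, key=lambda t: t.executor != "boto3_sweep")


if __name__ == "__main__":
    for step in plan_runbook(CATALOG, ["aws_s3_bucket", "aws_cloudtrail"]):
        print(step.executor, step.name)
```

Because the output depends only on the catalog and the sandbox inventory, the same inputs always produce the same runbook, which is what makes a planner like this easy to reason about and debug.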

Challenges Along the Way

Every piece pushed back. Terraform needed its own dance, Auth0 wanted different callbacks, Celery queues stalled, Gemini nitpicked model names, and the Stratus CLI caught every missing flag. I wanted LLM-driven agents running the show, but putting that in AWS safely with a weekend deadline wasn’t realistic, so I stuck with a deterministic planner I could reason about. Time was the choke point: wiring CI, writing tests, and getting clean automation all had to wait while I forced the end-to-end loop to behave, and even that was more tangled than I expected.

Wins I’m Proud Of

  • This is the biggest thing I’ve ever shipped solo in a weekend.
  • The console feels like a tool I’d actually keep open, not just a demo slide.
  • The run pipeline surfaces real issues with follow-up guidance, so the loop feels useful instead of gimmicky.

Lessons Learned

  • AI pair programmers help, but only when you already know how the pieces fit.
  • Scripted runs still land when the UX is clean and the report actually makes sense.
  • Scope creep never sleeps; sticking to the MVP kept the project from blowing up.

What’s Next

  • Replace the deterministic planner with an LLM-driven one so the playbooks adapt on the fly.
  • Backfill tests and wire up CI so future changes aren’t guesswork.
  • Grow the technique catalog, Stratus mappings, and remediation notes to cover more real-world messes, and streamline importing any Terraform config so other teams can test their own environments the way I test mine in the demo.
