Inspiration

ACM@UCLA was spending $3,600 a year on AWS for infrastructure that could run on $300 of hardware sitting in our office. A 12x premium to rent what we could own. And we weren't alone — every club, every startup, every side project is feeding money into AWS, Vercel, and DigitalOcean for compute that should cost almost nothing.

But cost was only half of it. The other half was friction. Spinning up a VM at ACM meant SSH-ing into a Proxmox node, manually cloning a template, editing cloud-init by hand, hunting down a free IP, and praying the network config took. Twenty minutes of tribal knowledge for something AWS does in a click. Most of our members never bothered — they'd just spin up an EC2 instance on a personal card and expense it later. The hardware was sitting there, paid for, idle, because the path to using it was too painful.

And then there's the data. Every project we hand to AWS is one more piece of our work living on someone else's servers, under someone else's terms, subject to whatever pricing or policy change ships next quarter. For a student org sitting on perfectly capable hardware, none of that needs to be true.

We built Nimbus because we were done renting. The hardware exists. The software should make it as easy to use as the cloud — and then there's no reason to pay the cloud anymore.

What it does

Nimbus is an open-source VM provisioning platform that gives you the EC2 experience on hardware you own. Two clicks. Thirty to sixty seconds. A running Linux VM with a static IP and SSH credentials: no AWS account, no API key, no data leaving the room.

The moment you provision, Nimbus integrates with Gopher — our existing open-source reverse-tunnel gateway — to expose your VM at a stable HTTPS subdomain instantly. Unlike Cloudflare Tunnels or ngrok, traffic never leaves hardware you own. Your VM is reachable at your-project.uclaacm.com before you've opened your SSH client. Caddy handles TLS automatically.

Need object storage? Nimbus provisions a MinIO bucket alongside your VM — standard S3 API, fully local, no egress fees.

Running AI in the cloud is all the rage right now. We do it locally. Nimbus provisions Docker containers directly on the NVIDIA GX10, a petaFLOP-class AI supercomputer, with the same two clicks. Load model weights from MinIO, run inference on the GX10, write results back. Air-gapped if you want it.

How we built it

Nimbus is a single Go binary with the React SPA embedded at compile time. One process, one SQLite file, zero external dependencies. The AWS console, but you can scp it to a server.

It runs on any Proxmox cluster — for the hackathon we brought our own: 5 repurposed laptops, a router, and a GX10. Provisioning is a 9-step orchestration over the Proxmox REST API: validate, reserve IP, score nodes, clone, inject SSH and network via cloud-init, resize, boot, poll qemu-guest-agent, persist. 30–60 seconds end to end.
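As a rough illustration of that orchestration, here is a minimal sketch of a sequential step runner. It is not the Nimbus source: the `step` type, `runSteps`, `provision`, and the hyphenated step names are hypothetical, and every step body is a no-op stand-in for the real Proxmox call. What it shows is the shape: run the stages in order, stop at the first failure, and report which stage failed and which ones completed.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// step is one named stage of a provisioning run (hypothetical type).
type step struct {
	name string
	run  func(ctx context.Context) error
}

// runSteps executes steps in order and stops at the first failure,
// wrapping the error with the failing step's name. It returns the names
// of the steps that completed so a caller could undo them.
func runSteps(ctx context.Context, steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := ctx.Err(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		if err := s.run(ctx); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

// stepNames lists the nine stages in the order given in the text.
var stepNames = []string{
	"validate", "reserve-ip", "score-nodes", "clone", "cloud-init",
	"resize", "boot", "poll-guest-agent", "persist",
}

// provision runs the nine stages with no-op bodies. failAt optionally
// names one stage that should fail, so the error path can be exercised.
func provision(failAt string) ([]string, error) {
	steps := make([]step, 0, len(stepNames))
	for _, name := range stepNames {
		name := name // capture per iteration (needed before Go 1.22)
		steps = append(steps, step{name: name, run: func(context.Context) error {
			if name == failAt {
				return errors.New("simulated failure")
			}
			return nil
		}})
	}
	return runSteps(context.Background(), steps)
}

func main() {
	done, err := provision("clone")
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Returning the completed step names is one way to support cleanup, for example releasing a reserved IP when the clone fails.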

VMs optionally register with Gopher, our prior reverse-tunnel project. The VM dials out over rathole; Caddy terminates TLS with Let's Encrypt and proxies the public hostname through the tunnel.

The GX10 chip isn't partitionable, so VMs don't get direct GPU access. Instead, the GX10 runs an always-on, OpenAI-compatible inference server and a job worker that launches containers with --gpus all. From the VM's perspective the GPU is just a service.
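To make "the GPU is just a service" concrete, here is a hedged sketch of how a program inside a VM might build a request to an OpenAI-compatible chat endpoint. The host `gx10.internal:8000` and the model name are made up for illustration. The `/v1/chat/completions` path and the `model`/`messages` body follow the OpenAI chat completions convention, which we assume the inference server mirrors.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the minimal OpenAI chat request shape.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// chatBody serializes a single-turn user prompt.
func chatBody(model, prompt string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
}

// newChatRequest builds (but does not send) a POST to the chat endpoint
// under baseURL. baseURL is whatever address the inference server has on
// the cluster network; the one used in main is hypothetical.
func newChatRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := chatBody(model, prompt)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://gx10.internal:8000", "example-model", "hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```

Sending the request is a single `http.DefaultClient.Do(req)` call; the VM needs no GPU driver, only network access to the server.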

Challenges we ran into

The Proxmox API has the documentation of a 2008 PHP forum thread. It expects application/x-www-form-urlencoded, not JSON. SSH keys must be URL-encoded with spaces written as %20, because a + is treated as a literal plus sign rather than a space. The clone endpoint silently places the clone on the source node unless you also pass target=.
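A small sketch of those encoding quirks, using only Go's standard library. The helper names (`encodeSSHKeys`, `cloneForm`, `cloudInitForm`) are hypothetical. The parameter names (`newid`, `name`, `full`, `target`, `ciuser`, `sshkeys`) are the ones the Proxmox API uses for clone and VM config. Pre-encoding the key and then letting the form encoder escape it a second time is our assumption about what the API expects, so check it against your Proxmox version.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeSSHKeys percent-encodes public keys so that spaces become %20.
// url.QueryEscape emits "+" for a space and "%2B" for a literal plus,
// so every "+" left in its output stands for a space.
func encodeSSHKeys(keys string) string {
	return strings.ReplaceAll(url.QueryEscape(keys), "+", "%20")
}

// cloneForm builds the form-encoded body for a clone request.
func cloneForm(newID int, name, targetNode string) string {
	v := url.Values{}
	v.Set("newid", fmt.Sprint(newID))
	v.Set("name", name)
	v.Set("full", "1")
	v.Set("target", targetNode) // without this the clone stays on the source node
	return v.Encode()
}

// cloudInitForm builds the form-encoded body that sets the cloud-init
// user and SSH keys. The key value is pre-encoded and Encode() escapes it
// again (assumption: the API decodes this parameter twice).
func cloudInitForm(user, sshKeys string) string {
	v := url.Values{}
	v.Set("ciuser", user)
	v.Set("sshkeys", encodeSSHKeys(sshKeys))
	return v.Encode()
}

func main() {
	fmt.Println(cloneForm(101, "web", "node2"))
	fmt.Println(cloudInitForm("ubuntu", "ssh-ed25519 AAAA+b user@host"))
}
```

Either body is sent with the header `Content-Type: application/x-www-form-urlencoded`.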

Cloud-init silently fails when the template lacks a cloudinit drive. API returns 200, applies nothing. We now probe the template and fail loudly instead of letting users discover they can't SSH in.
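The probe can be as simple as scanning the template's config for a disk slot that holds a cloud-init volume. The sketch below is an assumption about how such a check might look, not the actual Nimbus code. It takes the already-decoded config map (for example from the VM config endpoint) and assumes the usual Proxmox convention that the volume name contains `cloudinit`, as in `ide2: local-lvm:vm-9000-cloudinit,media=cdrom`. All function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// diskBuses are the config key prefixes that can hold a drive.
var diskBuses = []string{"ide", "sata", "scsi", "virtio"}

// isDiskKey reports whether key looks like a drive slot such as "ide2"
// or "scsi0". Keys like "scsihw" are rejected because the suffix is not
// a number.
func isDiskKey(key string) bool {
	for _, bus := range diskBuses {
		if strings.HasPrefix(key, bus) {
			_, err := strconv.Atoi(strings.TrimPrefix(key, bus))
			return err == nil
		}
	}
	return false
}

// hasCloudInitDrive reports whether any drive slot in a decoded VM config
// points at a cloud-init volume.
func hasCloudInitDrive(config map[string]any) bool {
	for key, val := range config {
		s, ok := val.(string)
		if !ok || !isDiskKey(key) {
			continue
		}
		volume, _, _ := strings.Cut(s, ",") // drop options like media=cdrom
		if strings.Contains(volume, "cloudinit") {
			return true
		}
	}
	return false
}

// checkTemplate turns the silent failure into a loud one.
func checkTemplate(config map[string]any) error {
	if !hasCloudInitDrive(config) {
		return errors.New("template has no cloud-init drive; SSH keys and network config would not be applied")
	}
	return nil
}

func main() {
	tmpl := map[string]any{
		"scsi0":  "local-lvm:base-9000-disk-0,size=4G",
		"scsihw": "virtio-scsi-pci",
		"cores":  2,
	}
	fmt.Println(checkTemplate(tmpl))
}
```

Running the check before the clone step means a bad template fails the request up front instead of producing a VM nobody can log in to.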

Two concurrent provisions could grab the same IP. We rebuilt the pool as a cache with a reconciler that treats Proxmox as the source of truth.
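One way to structure such a pool, sketched under our own assumptions rather than copied from Nimbus: reservations are made under a mutex so two provisions can never pick the same address, and a `Reconcile` call replaces the cached assignments with whatever the hypervisor reports. The type and method names, the `pending`/`assigned` split, and the example addresses are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errPoolExhausted = errors.New("ip pool exhausted")

// ipPool caches address ownership. assigned is the last state observed
// from the hypervisor; pending holds reservations that have been handed
// out but are not yet visible there.
type ipPool struct {
	mu       sync.Mutex
	addrs    []string
	pending  map[string]int
	assigned map[string]int
}

func newIPPool(addrs []string) *ipPool {
	return &ipPool{addrs: addrs, pending: map[string]int{}, assigned: map[string]int{}}
}

// Reserve atomically hands the first free address to vmID.
func (p *ipPool) Reserve(vmID int) (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, a := range p.addrs {
		_, isPending := p.pending[a]
		_, isAssigned := p.assigned[a]
		if !isPending && !isAssigned {
			p.pending[a] = vmID
			return a, nil
		}
	}
	return "", errPoolExhausted
}

// Reconcile overwrites the cached assignments with the hypervisor's view
// (address -> VM ID) and clears pending entries that are now confirmed.
func (p *ipPool) Reconcile(actual map[string]int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.assigned = make(map[string]int, len(actual))
	for a, id := range actual {
		p.assigned[a] = id
		delete(p.pending, a)
	}
}

// reserveConcurrently issues n reservations from separate goroutines.
func reserveConcurrently(p *ipPool, n int) []string {
	out := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i], _ = p.Reserve(100 + i)
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	p := newIPPool([]string{"10.0.0.10", "10.0.0.11", "10.0.0.12"})
	fmt.Println(reserveConcurrently(p, 3))
}
```

A production version would also need to expire or release pending reservations whose provision failed, otherwise they stay reserved forever.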

The GX10 destroyed our first GPU plan. We assumed PCIe passthrough. Then we learned the chip is unified Grace+Blackwell silicon and doesn't partition. Rethought the whole GPU plane mid-hackathon.

Accomplishments that we're proud of

We open-sourced AWS in 36 hours — multi-node cluster, automatic node selection, cloud-init, public tunnel networking with automatic TLS, S3-compatible object storage, and GPU compute — built from scratch on five laptops carried in that morning.

ACM@UCLA was spending $3,600 a year on AWS. With Nimbus, that drops to the cost of the hardware. One time.

What we learned

Hyperscalers don't actually share GPUs. EC2 hands out whole cards; "sharing" happens at the API layer. The only real shared-GPU pattern is what SageMaker and Bedrock do — put a service in front of the GPU. Once we accepted that, the GX10 design got simple.

Single-writer SQLite is enough. MaxOpenConns=1 with proper transactions handles every concurrent provision the cluster can produce. Our deployment is cp nimbus /opt/nimbus.

The hard part of self-hosted isn't compute — it's the network. Spinning up a VM on bare metal is the easy half. Reaching it from the internet without router config is what kills most self-hosted projects.

What's next for Nimbus

VM expiry and renewal, web SSH in the portal, DNS integration, multi-cluster federation, and a one-command installer that works on any Proxmox cluster in under five minutes.

The cloud had a good run. We're done renting.
