Inspiration

Cloud bills balloon because Terraform configs drift from real usage: instances stay oversized long after the traffic spike that justified them. We wanted that gap to close itself.

What it does

InfraLens watches .tf changes via GitHub webhooks, pairs each snapshot with live metrics, and has an AI agent open a pull request that right-sizes instances, replicas, and storage tiers.

How we built it

FastAPI + async SQLAlchemy backend, in-process asyncio pub/sub event bus, React + Vite console over WebSockets, Postgres for history, Terraform-provisioned DigitalOcean/Snowflake infra.
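As a rough illustration of the in-process asyncio pub/sub idea, here is a minimal sketch. The `EventBus` class, its method names, and the `run:42` topic are hypothetical and not taken from the InfraLens codebase; the assumption is simply that each subscriber (for example, one WebSocket connection) gets its own `asyncio.Queue`.

```python
import asyncio
from collections import defaultdict


class EventBus:
    """Hypothetical in-process pub/sub: one asyncio.Queue per subscriber."""

    def __init__(self) -> None:
        # topic name -> set of subscriber queues
        self._subscribers = defaultdict(set)

    def subscribe(self, topic: str) -> asyncio.Queue:
        queue = asyncio.Queue()
        self._subscribers[topic].add(queue)
        return queue

    def unsubscribe(self, topic: str, queue: asyncio.Queue) -> None:
        self._subscribers[topic].discard(queue)

    def publish(self, topic: str, event: dict) -> int:
        """Hand the event to every current subscriber; return how many got it."""
        queues = list(self._subscribers[topic])
        for queue in queues:
            queue.put_nowait(event)
        return len(queues)


async def demo() -> list:
    bus = EventBus()
    # Two consumers (think: two open console tabs) watching the same run.
    tab_a = bus.subscribe("run:42")
    tab_b = bus.subscribe("run:42")
    bus.publish("run:42", {"stage": "snapshot"})
    return [await tab_a.get(), await tab_b.get()]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because everything lives in one process, this only fans out to subscribers attached to the same server instance, which is the trade-off of skipping Redis.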

Snowflake

We used:

  • Stages to upload raw JSONs for processing
  • Pipes to ingest data into SQL tables
  • Tables for storing data between workflows
  • Stored Procedures/Tasks to process data
  • Cortex AI to summarize and explain changes

DigitalOcean

We used:

  • /v2/sizes/ endpoint to get costs for Droplets
  • /v1/chat/completions with the nemotron-nano-12b-v2-vl model as a backup for Cortex AI
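To make the pricing lookup concrete, here is a minimal sketch of how Droplet sizes from `/v2/sizes` could feed a right-sizing decision. The helper names (`fetch_sizes`, `cheapest_size`) are hypothetical, the sample entries and prices are made-up placeholders rather than real DigitalOcean data, and the sketch reads only the first page of results.

```python
import json
import urllib.request
from typing import Optional

API_URL = "https://api.digitalocean.com/v2/sizes"


def fetch_sizes(token: str) -> list:
    """Fetch the first page of Droplet sizes (pagination not handled here)."""
    request = urllib.request.Request(
        API_URL, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["sizes"]


def cheapest_size(sizes: list, min_vcpus: int, min_memory_mb: int) -> Optional[dict]:
    """Return the cheapest available size meeting the CPU and memory floor."""
    candidates = [
        s for s in sizes
        if s.get("available", True)
        and s["vcpus"] >= min_vcpus
        and s["memory"] >= min_memory_mb  # memory is reported in MB
    ]
    return min(candidates, key=lambda s: s["price_monthly"], default=None)


# Placeholder entries shaped like the API's size objects; values are invented.
SAMPLE_SIZES = [
    {"slug": "example-small", "vcpus": 1, "memory": 1024, "price_monthly": 5.0, "available": True},
    {"slug": "example-medium", "vcpus": 2, "memory": 4096, "price_monthly": 20.0, "available": True},
    {"slug": "example-large", "vcpus": 4, "memory": 8192, "price_monthly": 40.0, "available": True},
]

if __name__ == "__main__":
    print(cheapest_size(SAMPLE_SIZES, min_vcpus=2, min_memory_mb=2048))
```

Keeping the selection logic separate from the HTTP call lets the same function run against cached or recorded responses.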

ElevenLabs

We used ElevenLabs with Twilio to phone the user, walk them through the proposed changes, and let them approve.

Challenges we ran into

Idempotent webhook ingest under GitHub retries, modeling a clean run state machine, streaming live progress without a worker queue, and keeping the agent's suggestions actually safe to merge.
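For the webhook part, a minimal sketch of signature checking plus deduplication might look like the following. GitHub signs the raw body with HMAC-SHA256 and sends it in the `X-Hub-Signature-256` header, and each delivery carries an `X-GitHub-Delivery` ID. The `WebhookIngest` class and the in-memory set are assumptions for illustration; a real service would more likely rely on a unique constraint in Postgres.

```python
import hashlib
import hmac
from typing import Optional


def signature_is_valid(secret: str, body: bytes, header: Optional[str]) -> bool:
    """Check an X-Hub-Signature-256 value; refuse if anything is missing."""
    if not secret or not header:
        return False  # fail closed on a missing secret or missing header
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header.encode())


class WebhookIngest:
    """Hypothetical ingest step: validate first, then drop repeat deliveries."""

    def __init__(self, secret: str) -> None:
        self._secret = secret
        self._seen = set()  # stand-in for a unique index on the delivery ID

    def handle(self, body: bytes, signature: Optional[str], delivery_id: str) -> str:
        if not signature_is_valid(self._secret, body, signature):
            return "rejected"
        if delivery_id in self._seen:
            return "duplicate"  # a retry of something already processed
        self._seen.add(delivery_id)
        return "accepted"


if __name__ == "__main__":
    secret, body = "example-secret", b'{"ref": "refs/heads/main"}'
    sig = "sha256=" + hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    ingest = WebhookIngest(secret)
    print(ingest.handle(body, sig, "delivery-1"))  # first delivery
    print(ingest.handle(body, sig, "delivery-1"))  # retried delivery
```

Checking the signature before the dedup lookup means unauthenticated requests never touch stored state.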

Cortex AI decided to stop working in the middle of the night, so we had to spin up a DigitalOcean AI endpoint.

Snowflake reports that a job is running, but not that it has finished, which makes it hard to run tasks sequentially.
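One common workaround for this kind of problem is to poll until the job reaches a terminal state before starting the next one. The sketch below is generic and not necessarily what we shipped: `get_status` is a hypothetical async callable that would wrap whatever status query is available, and the state names `"succeeded"` and `"failed"` are placeholders, not Snowflake's own.

```python
import asyncio
from typing import Awaitable, Callable

# Placeholder terminal states; real names depend on the status source.
TERMINAL_STATES = {"succeeded", "failed"}


async def wait_until_done(
    get_status: Callable[[], Awaitable[str]],
    *,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> str:
    """Poll get_status until it reports a terminal state or the timeout hits."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await get_status()
        if status in TERMINAL_STATES:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout}s")
        await asyncio.sleep(interval)


async def run_in_order(jobs: list, **poll_options) -> list:
    """Wait for each job in turn; stop at the first one that does not succeed."""
    results = []
    for get_status in jobs:
        status = await wait_until_done(get_status, **poll_options)
        results.append(status)
        if status != "succeeded":
            break
    return results
```

Polling trades some latency for simplicity; the `interval` and `timeout` defaults here are arbitrary.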

Accomplishments that we're proud of

End-to-end flow working: webhook → snapshot → AI run → live UI → PR. HMAC-validated, dedup-safe, fails closed on misconfig. Every stage is observable; nothing is a black box.

What we learned

Async SQLAlchemy 2.x patterns, designing event-driven systems without Redis, prompt engineering for infra reasoning, and that PRs beat dashboards as the surface for AI suggestions.

What's next for InfraLens

A real worker to drive runs through running → suggestion_ready → pr_opened, Redis fan-out for multi-worker scale, richer metric sources, and policy guardrails on agent suggestions.
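The run lifecycle named above could be enforced with a small transition table. This is a sketch under assumptions: `running`, `suggestion_ready`, and `pr_opened` come from our design, while the `queued` and `failed` states and the `advance` helper are hypothetical additions for illustration.

```python
from enum import Enum


class RunState(str, Enum):
    QUEUED = "queued"                      # assumed initial state
    RUNNING = "running"
    SUGGESTION_READY = "suggestion_ready"
    PR_OPENED = "pr_opened"
    FAILED = "failed"                      # assumed failure state


# Each state maps to the set of states it may move to next.
ALLOWED = {
    RunState.QUEUED: {RunState.RUNNING, RunState.FAILED},
    RunState.RUNNING: {RunState.SUGGESTION_READY, RunState.FAILED},
    RunState.SUGGESTION_READY: {RunState.PR_OPENED, RunState.FAILED},
    RunState.PR_OPENED: set(),
    RunState.FAILED: set(),
}


def advance(current: RunState, target: RunState) -> RunState:
    """Return the target state if the move is legal, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


if __name__ == "__main__":
    state = RunState.QUEUED
    for nxt in (RunState.RUNNING, RunState.SUGGESTION_READY, RunState.PR_OPENED):
        state = advance(state, nxt)
    print(state.value)
```

Rejecting illegal moves in one place keeps a worker from, say, opening a PR for a run that never produced a suggestion.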
