Inspiration
The greatest bottleneck for developers, especially in high-stakes environments like hackathons or early-stage startups, isn't writing the application logic; it's the friction of configuring cloud infrastructure, CI/CD pipelines, and database schemas. We realized that while LLMs are great at writing boilerplate code, they often fail at executing robust, multi-step deployment operations. We wanted to build an AI that doesn't just give advice but acts as an autonomous Level-5 Site Reliability Engineer (SRE), actively configuring the cloud ecosystem so developers can focus strictly on the product.
What it does
OmniDeploy is an orchestration of three specialized agents operating in a continuous feedback loop:
The Architect Agent
Ingests the user's natural language requirements and designs a cloud architecture blueprint.
The Validator Agent (SRE)
Uses Agentic RAG connected to the latest Google Cloud and Supabase documentation to critique the Architect's design for security, scalability, and cost-efficiency.
The Executor Agent
Converts the approved blueprint into executable Terraform and GCP CLI scripts, autonomously dry-runs the deployment, and establishes the database schema and authentication flow.
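The feedback loop between the three agents can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Blueprint`, `Review`, and `Agents` types, the `runPipeline` function, and the `maxRounds` limit are all hypothetical names introduced here, and each agent is modeled as a plain async function so that any LLM-backed implementation could be plugged in.

```typescript
// Hypothetical types: the write-up does not specify the real data shapes.
interface Blueprint {
  description: string;
  revision: number;
}

interface Review {
  approved: boolean;
  issues: string[]; // critique passed back to the Architect on rejection
}

interface Agents {
  architect: (requirements: string, feedback: string[]) => Promise<Blueprint>;
  validator: (blueprint: Blueprint) => Promise<Review>;
  executor: (blueprint: Blueprint) => Promise<string>;
}

// Architect proposes, Validator critiques, and the Executor only runs
// once a blueprint has been approved. maxRounds is an assumed safeguard
// against an endless debate.
async function runPipeline(
  requirements: string,
  agents: Agents,
  maxRounds = 3
): Promise<string> {
  let feedback: string[] = [];
  for (let round = 1; round <= maxRounds; round++) {
    const blueprint = await agents.architect(requirements, feedback);
    const review = await agents.validator(blueprint);
    if (review.approved) {
      return agents.executor(blueprint);
    }
    feedback = review.issues; // feed the critique into the next proposal
  }
  throw new Error(`No approved blueprint after ${maxRounds} rounds`);
}
```

Capping the number of rounds is a design choice made for this sketch; without some bound, an Architect and Validator that never agree would loop indefinitely.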
How we built it
We used Next.js to build a responsive, real-time dashboard where users can monitor the agents "debating" and executing tasks. The backend runs on Node.js and uses function calling through the Gemini/Vertex AI API to interface directly with external cloud APIs.
For the knowledge base, we implemented an advanced Retrieval-Augmented Generation (RAG) pipeline using PostgreSQL (via Supabase) with the pgvector extension. This allowed our Validator Agent to instantly cross-reference generated infrastructure code against canonical documentation. Authentication and session management are also handled natively through Supabase, ensuring a secure and scalable foundation.
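As a rough illustration of the retrieval step, the sketch below shows a top-k similarity lookup. The `doc_chunks` table, its columns, and the `DocChunk`, `cosineDistance`, and `topK` names are assumptions made for this example and do not come from the project. The SQL string uses pgvector's `<=>` cosine-distance operator; the in-memory `topK` function reproduces the same ordering so the idea can be tried without a database.

```typescript
// Hypothetical schema: doc_chunks(id, source, content, embedding vector).
// `<=>` is pgvector's cosine-distance operator (smaller = more similar).
const TOP_K_SQL = `
  SELECT id, source, content, embedding <=> $1::vector AS distance
  FROM doc_chunks
  ORDER BY embedding <=> $1::vector
  LIMIT $2
`;

interface DocChunk {
  id: number;
  source: string;
  content: string;
  embedding: number[];
}

// Cosine distance = 1 - cosine similarity, matching the `<=>` semantics.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for a zero vector");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// In-memory equivalent of the ORDER BY ... LIMIT query above.
function topK(query: number[], chunks: DocChunk[], k: number): DocChunk[] {
  return chunks
    .map((chunk) => ({ chunk, distance: cosineDistance(query, chunk.embedding) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

In a real deployment the query embedding would be passed as the first parameter of `TOP_K_SQL` and the ranking would happen inside PostgreSQL rather than in application code.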
We gate provisioning on an Architecture Confidence Score (ACS), which is computed before the Executor Agent is allowed to provision resources. The score estimates the likelihood of deployment success from two terms: the cosine similarity between the proposed architecture (A_p) and canonical best-practice architectures (A_c) retrieved via RAG, and a term that shrinks as the error rate (ϵ) observed during the CLI dry-run phase grows:
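$$\mathrm{ACS} = \left(\alpha \cdot \frac{A_p \cdot A_c}{\lVert A_p \rVert \, \lVert A_c \rVert}\right) + \left(\beta \cdot \frac{1}{1 + \epsilon}\right)$$
Where α and β are tuning weights for semantic validity and execution success, respectively. If ACS ≥ 0.85, the infrastructure is autonomously provisioned.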
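The score can be computed in a few lines. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the architectures are assumed to be represented as embedding vectors, and the default weights α = β = 0.5 are illustrative only, since the write-up does not give the values actually used. Only the formula and the 0.85 threshold come from the description above.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// ACS = alpha * cos(A_p, A_c) + beta * 1 / (1 + epsilon).
// The default weights are assumed values for illustration.
function architectureConfidenceScore(
  proposed: number[],
  canonical: number[],
  errorRate: number,
  alpha = 0.5,
  beta = 0.5
): number {
  if (errorRate < 0) {
    throw new Error("errorRate must be non-negative");
  }
  const semanticTerm = alpha * cosineSimilarity(proposed, canonical);
  const executionTerm = beta * (1 / (1 + errorRate));
  return semanticTerm + executionTerm;
}

// Threshold taken from the description: provision only when ACS >= 0.85.
const ACS_THRESHOLD = 0.85;

function shouldProvision(acs: number): boolean {
  return acs >= ACS_THRESHOLD;
}
```

With weights that sum to 1, a proposal identical to the canonical architecture and an error-free dry run yields a score of 1, while any dry-run errors or semantic drift pull the score down toward the threshold.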
Challenges we ran into
The primary challenge was "hallucination in execution." LLMs frequently invent configuration flags or use deprecated Terraform syntax that causes deployment failures. Standard prompting could not fix this. We solved this by implementing the "Multi-Agent Debate" framework. By forcing the Architect Agent to submit its code to the Validator Agent (which is strictly grounded by our RAG vector database), we reduced deployment syntax errors by over 80%.
Accomplishments that we're proud of
We successfully closed the loop between AI reasoning and real-world execution. Seeing the system take a one-sentence prompt, autonomously debate the optimal scaling strategy, and spin up Google Cloud Run instances and a functional Supabase backend without human intervention was a massive breakthrough for the team.
What's next for OmniDeploy
We plan to integrate cost-estimation algorithms so the agents can optimize not just for performance but also for a strict monthly budget. We also plan to introduce a "self-healing" feature in which the agent monitors real-time server logs and autonomously scales infrastructure in response to traffic spikes.
Built With
- gemini-api
- google-cloud
- google-cloud-run
- google-vertex-ai
- next.js
- node.js
- pgvector
- postgresql
- supabase
- terraform
- typescript