Project GeminiFlow: A Multi-Agent DevSecOps Co-Pilot
Inspiration
As a full-stack developer, my world typically revolves around building front ends with React and React Native and shaping backend APIs with PHP Laravel. The intricate dance of CI/CD pipelines and cloud infrastructure always intrigued me, but I never got around to exploring it. Being a very hands-on learner, I knew I would have to get my hands dirty to learn this important part of the software development lifecycle.
Coincidentally, during a routine scroll through daily.dev, I came across an ad for the Google ADK Hackathon. I saw it as a good opportunity to learn the intricacies of not only DevOps but also multi-agent systems. With a background in AWS but virtually no hands-on experience with GCP, and a tangential relationship with DevOps, this felt like the ideal self-imposed challenge.
The first goal, as always, is to learn and challenge myself; producing something inspiring would be a welcome bonus. I wondered: how could a complex process like a modern DevSecOps pipeline be deconstructed and managed by a team of collaborating AI agents? This project, GeminiFlow, is my answer to that question. It represents a deliberate step into the deep end: a chance to learn GCP, delve into the ADK, and tackle a problem domain I don't live in every day, all for the sheer joy of the challenge.
What it does
GeminiFlow is a conversational, AI-powered co-pilot designed to automate and provide insights across the entire DevSecOps and FinOps lifecycle. A user can interact with it through a simple chat interface to execute complex, multi-step workflows that would normally require navigating multiple dashboards and tools.
At its core, GeminiFlow can perform four primary, high-level functions:
Automated DevSecOps Deployments: A user can issue a natural language command like, "Deploy our application from the main branch." The system then initiates a complete CI/CD pipeline, orchestrating multiple agents to fetch the source code, build and test the application, scan the resulting container for critical security vulnerabilities, and if all checks pass, deploy it to a Cloud Run environment. It even performs a post-deployment health check and can automatically roll back a bad deployment.
Infrastructure as Code (IaC) Provisioning: A user can ask to provision entirely new cloud environments. For example, "Plan a new staging environment called staging-1." GeminiFlow uses an Infrastructure Agent with Terraform to generate a plan, summarises it for the user using Gemini, and upon approval, applies the plan to create the new resources.
On-Demand Health & Monitoring Reports: Users can ask, "What is the health of our production service?" The system's Monitoring Agent gathers key metrics and logs, then uses Gemini to produce a concise, human-readable summary of the service's current state.
Financial (FinOps) Reporting: By querying cloud billing data exported to BigQuery, users can ask, "How much did we spend last week?" and receive a summarised report on project costs and a breakdown by service, generated by the FinOps Agent.
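To make the deployment flow above more concrete, here is a minimal sketch of its control logic in plain Python: pre-deployment gates run in order and stop at the first failure, and a failed health check triggers a rollback. The names StepResult, DeployReport, and smart_deploy are hypothetical and chosen for illustration; the callables stand in for the real sub-agents, so this is not the actual GeminiFlow code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    ok: bool
    detail: str = ""


@dataclass
class DeployReport:
    succeeded: bool
    rolled_back: bool = False
    log: List[str] = field(default_factory=list)


# A step is a (label, callable) pair. The callables are hypothetical
# stand-ins for calls into the real sub-agents.
Step = Tuple[str, Callable[[], StepResult]]


def _run(step: Step, report: DeployReport) -> bool:
    """Execute one step, record its outcome in the log, and return its status."""
    label, run = step
    result = run()
    status = "ok" if result.ok else "failed"
    report.log.append(f"{label}: {status} {result.detail}".strip())
    return result.ok


def smart_deploy(gates: List[Step], deploy: Step,
                 health_check: Step, rollback: Step) -> DeployReport:
    report = DeployReport(succeeded=False)

    # Pre-deployment gates (fetch, build and test, scan): stop at the first failure.
    for step in gates:
        if not _run(step, report):
            return report

    # Nothing was released if the deployment itself fails, so no rollback is needed.
    if not _run(deploy, report):
        return report

    # A healthy service ends the workflow; an unhealthy one triggers a rollback.
    if _run(health_check, report):
        report.succeeded = True
        return report

    report.rolled_back = _run(rollback, report)
    return report
```

Keeping the gates as an ordered list means a new check, such as a licence scan, can be added without touching the rollback logic.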
How we built it
GeminiFlow is built on a foundation of specialised AI agents, each an expert in its domain, orchestrated by a central Master Orchestrator Agent (MOA).
Core Framework: The entire system is built using the Google Agent Development Kit (ADK) in Python.
Multi-Agent Architecture:
- Master Orchestrator Agent (MOA): The "brain" of the system. It's an LlmAgent that uses a Gemini model to understand user intent and call the appropriate high-level tools to initiate workflows.
- Sub-Agents: The MOA orchestrates a team of specialised agents, including:
  - SCA (Source Control Agent): Interacts with GitHub via PyGithub.
  - BTA (Build & Test Agent): Manages CI jobs using the google-cloud-build client library and analyses test results.
  - SecOps Agent: Queries the google-cloud-container-analysis API for vulnerabilities and uses Gemini for summarisation.
  - DA (Deployment Agent): Deploys services using the google-cloud-run client library.
  - MDA (Monitoring & Diagnostics Agent): Fetches data using google-cloud-monitoring and google-cloud-logging.
  - Rollback Agent: Performs rollbacks using the google-cloud-run client library.
  - FinOps Agent: Queries billing data with the google-cloud-bigquery client library.
  - Infra Agent: Runs Terraform plans and applies them by submitting jobs to Cloud Build.
LLM Integration: We used Gemini models via the Vertex AI endpoint for two main purposes:
- NLU & Tool Selection: The MOA uses Gemini to parse user requests and decide which workflow tool to call (e.g., execute_smart_deploy_workflow).
- Summarisation: Several agents (like SecOps and BTA) use Gemini to transform raw, complex data (vulnerability lists, test failures) into concise, human-readable summaries.
Hosting & API: The application is packaged into a Docker container and deployed as a serverless web app on Google Cloud Run. The backend is served by FastAPI, which provides a streaming API endpoint to give the user real-time feedback. The frontend is simple HTML, Tailwind CSS, and vanilla JavaScript.
Google Cloud Services: The project relies heavily on a suite of GCP services, including Cloud Build, Artifact Registry, Cloud Run, Artifact Analysis, Cloud Storage (for artifacts and Terraform state), IAM (for secure service accounts), and BigQuery (for billing data). A Cloud Storage bucket also holds the test results that the LLM scans to give us feedback.
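The tool-selection step described above can be pictured as a lookup from a tool name to a workflow function. The sketch below is a simplified stand-in written in plain Python rather than with the ADK: it assumes the model's decision arrives as a dictionary with "name" and "args" keys, and apart from execute_smart_deploy_workflow every name in it (TOOLS, tool, dispatch, get_service_health_report) is hypothetical.

```python
from typing import Any, Callable, Dict

# Registry of workflow tools, keyed by function name.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a workflow function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def execute_smart_deploy_workflow(branch: str = "main") -> str:
    # Placeholder body; the real workflow would call the sub-agents.
    return f"deploy requested from branch {branch}"


@tool
def get_service_health_report(service: str) -> str:  # hypothetical tool name
    # Placeholder body; the real tool would query the monitoring agent.
    return f"health report requested for {service}"


def dispatch(call: Dict[str, Any]) -> str:
    """Run the tool named in a model-style call: {"name": ..., "args": {...}}."""
    name = call.get("name")
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**call.get("args", {}))
```

For example, dispatch({"name": "execute_smart_deploy_workflow", "args": {"branch": "main"}}) runs the deploy placeholder, while an unrecognised name returns an error string instead of raising, so the orchestrator can report it back to the user.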
Challenges we ran into
The development journey was filled with valuable learning experiences, many of which came from debugging unexpected behaviour.
Client Library Nuances: We encountered several AttributeError and TypeError exceptions when using the Google Cloud client libraries. We learned that documentation snippets can sometimes be for different library versions and that the internal structure of these libraries can vary. This taught us the importance of reading the documentation extensively and relying on our own environment's error messages as the source of truth.
The "PATH" Substitution Error: A particularly stubborn 400 InvalidArgument error from Cloud Build was traced back to the export PATH command in a cloudbuild.yaml test step. When run programmatically via the API, the Cloud Build service was incorrectly interpreting this shell command as an attempt to set an invalid top-level build substitution. The fix was to remove the export and call the tool via its absolute path within the container.
Including CLOUD_LOGGING_ONLY: We learned that there is no default logging behaviour and that this needs to be specified in our cloudbuild.yaml options.
IAM Permission Chains: We learned a lot about the intricacies of IAM. It wasn't enough for the MOA's service account to have permission to call Cloud Build; it also needed permission to impersonate the service accounts used by the build triggers (iam.serviceAccounts.actAs). Similarly, we had to ensure the Cloud Run runtime service account had Artifact Registry Reader permissions, not just the service account that initiated the deployment.
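Two of the Cloud Build issues above could be caught before a build is submitted. The sketch below is a small pre-submit check over a build config expressed as a Python dictionary. It assumes that Cloud Build reads $NAME in step arguments as a substitution reference unless it is escaped as $$NAME, which is one plausible explanation for the PATH error. The function lint_build, the partial list of built-in substitutions, and the example image and paths are all hypothetical.

```python
import re
from typing import Any, Dict, List

# Partial list of built-in substitutions, for illustration only.
BUILTIN_SUBSTITUTIONS = {
    "PROJECT_ID", "BUILD_ID", "COMMIT_SHA", "SHORT_SHA",
    "BRANCH_NAME", "TAG_NAME", "REPO_NAME",
}

# Matches $NAME or ${NAME}, but not the escaped form $$NAME.
_REFERENCE = re.compile(r"(?<!\$)\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def lint_build(build: Dict[str, Any]) -> List[str]:
    """Return a list of likely problems in a Cloud Build config dictionary."""
    problems: List[str] = []

    # We had to set the logging mode explicitly (e.g. CLOUD_LOGGING_ONLY).
    if not build.get("options", {}).get("logging"):
        problems.append("options.logging is not set")

    for index, step in enumerate(build.get("steps", [])):
        for arg in step.get("args", []):
            for name in _REFERENCE.findall(arg):
                # User-defined substitutions start with an underscore.
                if name.startswith("_") or name in BUILTIN_SUBSTITUTIONS:
                    continue
                problems.append(
                    f"step {index}: ${name} looks like an unescaped substitution"
                )
    return problems


# Hypothetical test step resembling the one that failed for us.
failing_build = {
    "steps": [{
        "name": "python:3.11-slim",  # hypothetical builder image
        "entrypoint": "bash",
        "args": ["-c", "export PATH=$PATH:/root/.local/bin && pytest"],
    }],
}

# The same step after the fix: absolute tool path and explicit logging.
fixed_build = {
    "steps": [{
        "name": "python:3.11-slim",
        "entrypoint": "bash",
        "args": ["-c", "/root/.local/bin/pytest"],  # hypothetical path
    }],
    "options": {"logging": "CLOUD_LOGGING_ONLY"},
}
```

Calling lint_build(failing_build) flags both the missing logging option and the $PATH reference, while lint_build(fixed_build) returns an empty list.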
Accomplishments that we're proud of
Building a True Multi-Agent System: We successfully designed and implemented a system with eight specialised agents that collaborate under a central orchestrator to perform complex, end-to-end tasks.
Complete DevSecOps Automation: The "Smart Deploy" workflow is a fully automated pipeline, from commit to a live, security-scanned, and health-checked deployment, complete with automated rollbacks.
Intelligent Summarisation: We didn't just automate tasks; we used Gemini to add a layer of intelligence, transforming raw technical data (vulnerabilities, test failures, metrics, costs) into summaries that are genuinely helpful and easy to understand.
Full-Featured UI: We created a responsive web UI that provides real-time feedback to the user, making the system feel interactive and alive, especially during long-running tasks.
Resilience and Debugging: We are proud of having overcome numerous complex technical hurdles related to API versions, IAM policies, and runtime environments. This resulted in a more robust and well-understood final product.
What we learned
This hackathon was an immense learning experience. Our key takeaways include:
Deep Practical Knowledge of ADK: We moved from zero knowledge to building a complex, hierarchical multi-agent system.
The Nuances of GCP IAM: We learned that understanding the principle of least privilege is one thing, but implementing it across multiple interacting services and service accounts requires careful attention to detail, especially regarding impersonation (actAs) permissions.
The Power of LLMs as "Sense-Makers": The most exciting discovery was how effective Gemini is at acting as a "universal parser" for summarising different kinds of technical data, from Terraform plans to test results, making them accessible to users.
Infrastructure as Code with Terraform: As a developer without daily DevOps responsibilities, this project was my first real-world application of Terraform. I learned how to define resources, manage remote state in GCS, and execute plans and applies securely through a CI/CD pipeline, moving from theory to practical implementation.
What's next for GeminiFlow
This project is a solid foundation, and we're excited about its potential. The next steps would be:
Generalise the System: Implement a database (like Firestore) to allow users to connect their own GitHub repositories and configure their own deployment targets, transforming GeminiFlow into a true multi-tenant platform.
Expand Agent Capabilities: Add more tools to the existing agents, such as allowing the DA to manage traffic splitting or the SCA to create pull requests.
Refine the UI: Build a more sophisticated user interface with features like viewing historical logs, visualising metrics, and managing configured repositories.