Inspiration
Code review is the biggest bottleneck in software development. Developers wait hours for reviews, reviewers context-switch constantly, and critical bugs slip through when reviews are rushed.
What it does
EvoFlow is a self-growing AI development pipeline on the GitLab Duo Agent Platform. Eleven specialized flows and one agent handle the entire development lifecycle:
- Suggest analyzes the codebase and proposes features.
- Implement reads issues, writes code, and creates branches and MRs.
- Review, Tech Debt, and Optimization each review from a different angle: security and bugs, maintainability, and performance.
- Compare aggregates and compares the reviewers' findings.
- Merge Check creates tracking issues for non-blockers and reports merge readiness.
- Batch Triage deduplicates issues and closes resolved ones.
A GCP enrichment layer (Gemini 3.1, a Skills DB on Firestore, and Vertex AI Search across various sites) supplies external context to the review and implementation flows via label-based polling. An AI Autopilot powered by Gemini Pro reads the full project state and orchestrates flows in semi-auto or full-auto mode from a Cloud Run dashboard.
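As a rough illustration of the two Autopilot modes, the sketch below separates deciding from executing. Every name in it (`Mode`, `Decision`, `Autopilot`, `run_flow`) is hypothetical and not taken from EvoFlow, and the Gemini call that would produce a decision is replaced by a hard-coded one.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Mode(Enum):
    SEMI_AUTO = "semi-auto"   # a human approves each decision
    FULL_AUTO = "full-auto"   # decisions run immediately


@dataclass
class Decision:
    flow: str        # which flow to trigger, e.g. "review"
    target: str      # hypothetical target reference, e.g. "MR !12"
    reasoning: str   # why this flow was chosen


@dataclass
class Autopilot:
    mode: Mode
    run_flow: Callable[[Decision], None]   # stand-in for triggering a flow
    pending: list[Decision] = field(default_factory=list)
    log: list[str] = field(default_factory=list)

    def handle(self, decision: Decision) -> None:
        # Log every decision with its reasoning, regardless of mode.
        self.log.append(
            f"{decision.flow} on {decision.target}: {decision.reasoning}"
        )
        if self.mode is Mode.FULL_AUTO:
            self.run_flow(decision)
        else:
            self.pending.append(decision)   # wait for human approval

    def approve(self, index: int = 0) -> None:
        # In semi-auto mode, a dashboard action would call this.
        self.run_flow(self.pending.pop(index))


executed: list[str] = []
pilot = Autopilot(Mode.SEMI_AUTO, run_flow=lambda d: executed.append(d.flow))
pilot.handle(Decision("review", "MR !12", "new MR without review notes"))
print(executed)   # still empty: the decision is waiting for approval
pilot.approve()
print(executed)   # the approved flow has now run
```

Keeping the log write ahead of the mode check means the reasoning trail is identical in both modes, so switching from semi-auto to full-auto changes only who pulls the trigger.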
How we built it
We started with a single review flow. After iterating on the prompt, we arrived at an agentic loop (triage → plan → execute → reflect → validate) that increased the number of actionable findings.
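In the flow itself this loop presumably lives in the prompt rather than in code, so the following is only a sketch of the control structure. The stage bodies, the `Finding` and `ReviewState` types, and the file names are all made up for illustration; in particular the `reachable` flag is hard-coded where a real review would trace the code.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    file: str
    message: str
    reachable: bool   # hard-coded here; a real review would trace the code


@dataclass
class ReviewState:
    changed_files: list[str]
    plan: list[str] = field(default_factory=list)
    findings: list[Finding] = field(default_factory=list)


def triage(state: ReviewState) -> ReviewState:
    # Keep only files worth reviewing (here: skip Markdown docs).
    state.changed_files = [f for f in state.changed_files if not f.endswith(".md")]
    return state


def make_plan(state: ReviewState) -> ReviewState:
    state.plan = [f"inspect {f}" for f in state.changed_files]
    return state


def execute(state: ReviewState) -> ReviewState:
    # Placeholder: raise one candidate finding per planned file, twice for
    # api.py so that the reflect stage has a duplicate to remove.
    for f in state.changed_files:
        state.findings.append(Finding(f, "unchecked input", f != "legacy.py"))
    state.findings.append(Finding("api.py", "unchecked input", True))
    return state


def reflect(state: ReviewState) -> ReviewState:
    # Remove duplicate findings (same file and message), keeping the first.
    unique: dict[tuple[str, str], Finding] = {}
    for finding in state.findings:
        unique.setdefault((finding.file, finding.message), finding)
    state.findings = list(unique.values())
    return state


def validate(state: ReviewState) -> ReviewState:
    # Drop findings whose code path cannot actually be reached.
    state.findings = [f for f in state.findings if f.reachable]
    return state


STAGES = [triage, make_plan, execute, reflect, validate]


def run_review(changed_files: list[str]) -> ReviewState:
    state = ReviewState(list(changed_files))
    for stage in STAGES:
        state = stage(state)
    return state


result = run_review(["api.py", "legacy.py", "README.md"])
print([f.file for f in result.findings])
```

Each stage takes and returns the same state object, which makes it easy to add, reorder, or log stages without touching the others.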
We built a full GCP stack: Firestore with knowledge documents and vector embeddings, Cloud Functions with 7 API endpoints, and Vertex AI Search indexing 19 coding sites.
We catalogued all 89 platform tools on GitLab, documented every limitation, and built 10 more flows. Each flow has a single responsibility and communicates through MR/issue notes.
We built the Cloud Run dashboard with Gemini-powered enrichment, the AI Autopilot with semi-auto and full-auto modes, and a Cloud Scheduler job that polls for new issues and MRs to enrich.
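The label-based polling cycle can be sketched as follows. This is a minimal in-memory illustration, not the project's code: the label names, the `Item` type, and `fake_enrich` are assumptions, and a real version would read and write issues and MRs through the GitLab API, with a scheduler calling `poll_once` periodically.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical label names; the labels EvoFlow actually uses are not stated.
PENDING = "enrich::pending"
DONE = "enrich::done"


@dataclass
class Item:
    """In-memory stand-in for a GitLab issue or MR."""
    iid: int
    title: str
    labels: set[str]
    notes: list[str] = field(default_factory=list)


def poll_once(items: list[Item], enrich: Callable[[Item], str]) -> int:
    """Enrich every item carrying the pending label; return the count handled."""
    handled = 0
    for item in items:
        if PENDING not in item.labels:
            continue
        item.notes.append(enrich(item))   # post the enrichment as a note
        item.labels.discard(PENDING)      # swap labels so the next poll skips it
        item.labels.add(DONE)
        handled += 1
    return handled


def fake_enrich(item: Item) -> str:
    # Stand-in for the Gemini / Skills DB / Vertex AI Search steps.
    return f"Context for '{item.title}': (summary would go here)"


items = [
    Item(1, "Add rate limiting", {PENDING}),
    Item(2, "Fix typo", {"docs"}),
]
print(poll_once(items, fake_enrich))   # 1: only the pending item is enriched
print(poll_once(items, fake_enrich))   # 0: the label swap makes the poll idempotent
```

Swapping the label only after the note is posted means a crash mid-run leaves the item pending, so the next poll retries it instead of silently skipping it.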
Challenges we ran into
- No internet from flows: Duo flows run server-side with zero external access, so we have to poll GitLab from the outside to notice changes.
- Flows can't trigger flows: Service account @mentions are ignored. We tried 3 different orchestrator approaches before accepting manual triggers and building the Autopilot on Cloud Run instead.
- Multi-component flows: Context passing between components works, but downstream components fail to post notes. We reverted to a single-component architecture.
- CI blocked: The Pipeline Execution Policy blocks all custom CI jobs. No external agents, no custom pipelines.

Accomplishments that we're proud of
- 11 flows + 1 agent: Each specialized, each with a single job.
- GCP enrichment actually works: Gemini generates targeted search queries, the Skills DB returns relevant patterns, Vertex AI Search finds real documentation, and Gemini summarizes it all into actionable notes. The full loop: label → poll → enrich → swap label.
- AI Autopilot: Gemini Pro reads the project state and decides what to do next. Semi-auto for oversight, full-auto for hands-off operation. Every decision is logged with its reasoning.

What we learned
- Start with constraints, not aspirations. We spent significant time building GCP services before discovering that flows can't reach them. Understanding the platform limits first would have saved a day.
- Specialized agents beat general ones. One review flow trying to check security, tech debt, and performance produced mediocre results. Three specialized flows each do their job well.
- The codebase is the best knowledge base. Searching existing code patterns with gitlab_blob_search provides better context than any external service. The code tells you how the project handles auth, validation, and errors.
- Validation eliminates false positives. The single highest-impact improvement was adding "read the actual code, trace the execution path, drop unreachable findings" before posting.
- Flows communicate through notes. Since flows can't call each other, MR/issue notes become the message bus. Each flow reads previous findings and builds on them.

What's next for EvoFlow
- Smarter Autopilot: Learn from past decisions. Track which flow sequences produce the best outcomes and adjust recommendations.
- Cross-repo patterns: Use the Skills DB to share security patterns across projects. "How do other repos handle JWT validation?" answered from 98K real patterns.
- Metrics flow: Aggregate review data across all MRs: precision rate, suggestion acceptance rate, time-to-merge improvement. Measure whether EvoFlow actually reduces developer friction.
- MCP integration: When GitLab releases MCP support in flows, connect directly to GCP services for real-time enrichment instead of async label polling.
- Multi-project orchestration: Deploy EvoFlow across multiple repos with a shared dashboard, comparing review quality and development velocity across teams.
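For the planned metrics flow, one possible way to aggregate the numbers is sketched below. The metric definitions are assumptions (precision as confirmed findings over posted findings, acceptance rate as accepted suggestions over suggestions made), the `MRReview` type is hypothetical, and the sample values are invented purely to exercise the code.

```python
from dataclasses import dataclass
from datetime import timedelta
from statistics import median


@dataclass
class MRReview:
    findings_posted: int        # findings the flows posted on the MR
    findings_confirmed: int     # findings a human judged to be real issues
    suggestions_made: int
    suggestions_accepted: int
    time_to_merge: timedelta


def ratio(numerator: int, denominator: int) -> float:
    # Avoid dividing by zero when nothing was posted yet.
    return numerator / denominator if denominator else 0.0


def summarize(reviews: list[MRReview]) -> dict[str, float]:
    """Aggregate per-MR review data; expects at least one review."""
    hours = [r.time_to_merge.total_seconds() / 3600 for r in reviews]
    return {
        "precision": ratio(
            sum(r.findings_confirmed for r in reviews),
            sum(r.findings_posted for r in reviews),
        ),
        "acceptance_rate": ratio(
            sum(r.suggestions_accepted for r in reviews),
            sum(r.suggestions_made for r in reviews),
        ),
        "median_time_to_merge_hours": median(hours),
    }


# Invented sample data, not measurements from EvoFlow.
sample = [
    MRReview(10, 8, 4, 2, timedelta(hours=2)),
    MRReview(10, 7, 4, 1, timedelta(hours=6)),
]
print(summarize(sample))
```

Measuring time-to-merge improvement would additionally need a baseline, for example the same median computed over MRs merged before the flows were enabled.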
Built With
- cloudrun
- firestore
- gcp
- gitlab
- python
- yaml