- Review of an MR on GitLab using the Agent Playground on Google Cloud Run
- Quorum findings for a zero coordinated related bug pull request on GitHub
- Comment posted by Quorum on an open-source GitHub repo PR
- Merge request opened by Quorum on GitLab
- A Quorum finding was substantial enough for a containerd/nerdbox contributor to consider adding it
- Quorum commented on a containerd/nerdbox PR; the suggested fix was added and the PR was merged
- Real open-source repos across GitHub/GitLab reviewed by Quorum
Inspiration
I'm early in my backend career, and the bugs that scared me most weren't the ones the compiler caught — they were the distributed ones. The first time I read Kleppmann on distributed locks, I realised a Redis lock I'd shipped with a static "locked" value was quietly broken. It passed review. It passed the linter. Nobody noticed, because the bug wasn't in my code — it was in what my code was missing. Linters can't see that, and generic AI reviewers only read the diff. So I built the reviewer I wished I'd had as a junior — one that checks the rest of the codebase before trusting the diff.
What it does
Quorum reviews GitLab MRs and GitHub PRs for 14 named distributed-coordination anti-patterns, including fencing tokens, saga compensation, retry jitter, lost updates, Kafka offsets, missing DLQs, and cascading timeouts. Each rule comes from a real source (Kleppmann, AWS, Confluent, the Google SRE book), so a finding teaches you why it's wrong, with a link and a fix.
How I built it
Two phases: a free regex scan of the diff that exits before spending a token if nothing smells off, then a Gemini 2.5 Pro agent loop that calls GitLab's MCP tools (semantic_code_search, get_file_contents) round after round — searching the repo, reading files, reasoning — until it's sure. That MCP search is the whole point: it's how the agent confirms a rollback handler is genuinely absent from the whole project, not just missing from the diff. Rules are data, not code, so adding one is a single-file PR. It runs three ways on Google Cloud: an interactive Vertex AI Agent Engine agent (Google ADK), a Cloud Run webhook, and a CLI — plus Quorum Live, a hosted page that streams the agent's thinking live.
Challenges I ran into
- Real MCP search was hard to reach: GitLab's HTTP MCP needs an OAuth session I couldn't script, so I fell back to the `glab mcp serve` CLI with a token, plus two more client tiers.
- The Agent Engine Playground is ADK-only: my first deploy used the older API and got refused, so I re-wrapped everything in a Google ADK app.
- False positives kill trust — so I ran Quorum against GitLab's own production repos and tuned until they all correctly passed.
How Quorum is architected
Quorum is one review engine that runs in three places, with a strict split between thinking and acting.
The review pipeline (two phases).
- Surface detector — free, instant. A pure-Python regex/keyword scan of the diff. A typical CRUD merge request touches no coordination pattern, so it exits in milliseconds with zero API calls. Gemini is only woken up when the diff actually smells of distributed coordination.
- Gemini agent loop: only when needed. When a surface fires, Gemini 2.5 Pro (with `thinking_budget` and Google Search grounding) drives a multi-round investigation, calling GitLab's MCP tools (`semantic_code_search`, `get_file_contents`, `get_merge_request`) repeatedly until it can emit structured JSON findings, each with a severity, a self-reported confidence score, and a reference.
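The two-phase gate above can be sketched roughly as follows. This is a minimal illustration, not Quorum's actual code: the pattern names in `SURFACES`, the regexes, and the `investigate` callback (standing in for the Gemini agent loop) are all assumptions made for the example.

```python
import re

# Hypothetical surface patterns; the real rule set is not shown in this writeup.
SURFACES = {
    "distributed_lock": re.compile(
        r"\b(setnx|redlock|acquire_lock)\b|\.set\([^)]*nx\s*=\s*True", re.IGNORECASE
    ),
    "retry": re.compile(r"\b(retry|backoff)\b", re.IGNORECASE),
    "kafka_offset": re.compile(r"\b(commit_offsets?|enable\.auto\.commit)\b", re.IGNORECASE),
}


def added_lines(diff: str) -> list[str]:
    """Return only the lines a unified diff adds, without the leading '+'."""
    return [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]


def detect_surfaces(diff: str) -> set[str]:
    """Phase 1: cheap regex scan with no network or model calls."""
    text = "\n".join(added_lines(diff))
    return {name for name, pattern in SURFACES.items() if pattern.search(text)}


def review(diff: str, investigate) -> list[dict]:
    """Run phase 2 only when phase 1 finds a coordination surface."""
    surfaces = detect_surfaces(diff)
    if not surfaces:
        return []  # typical CRUD change: exit without calling the model
    return investigate(diff, sorted(surfaces))
```

Passing the agent loop in as a callback keeps the cheap path free of any model dependency, so the early exit can be tested without API credentials.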
Rules are data, not code. Every one of the 14 rules is a single Python dataclass — keywords, regex, search-query templates, and reasoning guidance injected into Gemini's prompt.
There are no if/else detection chains; the intelligence lives in the prompt. Adding RULE_15 is a one-file pull request.
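As an illustration of what a rule-as-data definition might look like, here is a sketch of one dataclass carrying keywords, a regex, search-query templates, and prompt guidance. The field names, the `RULE_EXAMPLE` id, and the example values are hypothetical; the writeup does not show the real schema.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One anti-pattern described entirely as data (field names are illustrative)."""

    rule_id: str
    title: str
    keywords: tuple[str, ...]
    pattern: str                     # regex used by the surface detector
    search_queries: tuple[str, ...]  # templates for repository searches
    guidance: str                    # reasoning hints injected into the prompt
    reference: str                   # where the rule comes from

    def fires_on(self, text: str) -> bool:
        """True when any keyword or the regex matches the given text."""
        lowered = text.lower()
        return any(k in lowered for k in self.keywords) or bool(
            re.search(self.pattern, text, re.IGNORECASE)
        )

    def prompt_section(self, symbol: str) -> str:
        """Render the part of the prompt that describes this rule."""
        queries = "\n".join(f"- {q.format(symbol=symbol)}" for q in self.search_queries)
        return (
            f"## {self.rule_id}: {self.title}\n"
            f"{self.guidance}\nSuggested searches:\n{queries}"
        )


# Hypothetical rule instance, written as plain data.
FENCING_TOKEN = Rule(
    rule_id="RULE_EXAMPLE",
    title="Lock without a fencing token",
    keywords=("setnx", "redlock"),
    pattern=r"\.set\([^)]*nx\s*=\s*True",
    search_queries=("where is {symbol} released", "fencing token for {symbol}"),
    guidance="Check whether the lock value is unique per holder and verified on release.",
    reference="Kleppmann, 'How to do distributed locking'",
)
```

With this shape, a new rule is just another `Rule(...)` instance in its own file; the detection and prompt-building code does not change.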
Thinking vs. acting are separated on purpose. Gemini only investigates — it never writes. Deterministic Python decides when to post a comment, apply labels, or open a fix MR, then watches the pipeline to verify the fix passes CI. You never want a model autonomously writing to someone's repository.
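One way to picture that split is a small, model-free policy function that maps findings to an action. The sketch below is an assumption about the shape of such a policy: the `Finding` fields, the `Action` values, and the `MIN_CONFIDENCE` value of 0.7 are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    COMMENT = "comment"
    COMMENT_AND_FIX_MR = "comment_and_fix_mr"


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str       # "low", "medium", or "high"
    confidence: float   # self-reported by the model, 0.0 to 1.0
    has_fix: bool = False


MIN_CONFIDENCE = 0.7  # assumed threshold, not taken from the writeup


def decide(findings: list[Finding]) -> tuple[Action, list[Finding]]:
    """Pick an action from findings using fixed rules; the model is never consulted."""
    kept = [f for f in findings if f.confidence >= MIN_CONFIDENCE]
    if not kept:
        return Action.NONE, []
    if any(f.severity == "high" and f.has_fix for f in kept):
        return Action.COMMENT_AND_FIX_MR, kept
    return Action.COMMENT, kept
```

Because `decide` is a pure function, the conditions under which anything gets written to a repository can be unit-tested exhaustively.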
Three deployment modes, one engine.
| Mode | Surface | Google Cloud |
|---|---|---|
| Interactive | Agent you can chat with, built on the Google ADK | Vertex AI Agent Engine |
| Automated | GitLab/GitHub webhook — reviews on every push, plus the Quorum Live SSE demo page | Cloud Run |
| Power-user | quorum CLI + Python SDK for CI, with SARIF output | Local / any CI |
Resilience touches: a confidence threshold suppresses low-signal findings; a three-tier MCP client (official glab CLI → @zereight npm → REST) means it works whatever is installed; and a tool-call recovery loop nudges Gemini back to structured function calls when context caching makes it emit tool calls as plain text. Secrets live only in Secret Manager — never in the image.
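The tiered client idea can be sketched as a generic fallback chain: try each backend in order and return the first result. The tier functions below are stubs that simulate missing tools; their names, the `TierUnavailable` exception, and the return format are hypothetical and do not call the real glab CLI, npm package, or REST API.

```python
from typing import Callable, Sequence


class TierUnavailable(Exception):
    """Raised by a tier that cannot serve the request (tool missing, auth failure)."""


Tier = tuple[str, Callable[[str], str]]


def search_with_fallback(tiers: Sequence[Tier], query: str) -> tuple[str, str]:
    """Return (tier_name, result) from the first tier that succeeds."""
    errors = []
    for name, call in tiers:
        try:
            return name, call(query)
        except TierUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this tier was skipped
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


# Stub tiers standing in for the three real clients.
def glab_cli(query: str) -> str:
    raise TierUnavailable("glab not installed")


def npm_client(query: str) -> str:
    raise TierUnavailable("npx not installed")


def rest_api(query: str) -> str:
    return f"REST results for {query!r}"


TIERS = [("glab-cli", glab_cli), ("npm-mcp", npm_client), ("rest", rest_api)]
```

Calling `search_with_fallback(TIERS, "rollback handler")` with these stubs skips the first two tiers and returns the REST result, along with the name of the tier that answered.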
Accomplishments I'm proud of
Quorum found real coordination bugs in 9 open-source projects across Java, Go, Python, and TypeScript — all filed publicly, zero false positives. One fix, on containerd/nerdbox PR #218, got merged after Quorum's comment.
What I learned
The unit of AI code review isn't the diff — it's the investigation. The second I let Gemini drive its own MCP search loop instead of judging lines in isolation, the false positives dropped and it caught the exact bugs I'd been afraid of. And the trust comes from the boring part: plain Python decides when to post or fix — never the model.
Future goals
Quorum is built to grow from a one-shot reviewer into a persistent collaborator.
- Public, no-login chat playground. The Agent Engine Playground is IAM-gated, so outside viewers can't open it. A public chat proxy will stream the live ADK agent through the `/demo` page over SSE using the Cloud Run service account, so anyone can converse with the reviewer without a Google account (with rate-limiting and a per-session cost cap).
- Close-the-loop follow-up. A `quorum follow-up` command will re-check every issue Quorum has filed: re-run the surface detector at the commit that closed it, and if the anti-pattern is gone, post a final "Fixed in `<sha>`, confirmed by Quorum." This turns it from a reviewer into a teammate that verifies its own findings get resolved.
- Multi-tenant SaaS. OAuth login so Quorum acts on each user's own token (no shared PAT), a per-org dashboard of findings by rule/severity and fix-acceptance rate, and a persistent Firestore/PostgreSQL backend for trend analysis and regression detection across deployments.
- A growing taxonomy. Because every rule is a standalone, single-file contribution, the rule set is designed to expand past 14 with the community — covering more coordination patterns across more languages.
Built With
- alpine.js
- click
- cloud-build
- cloud-run
- docker
- fastapi
- gemini-2.5-pro
- github-actions
- github-rest-api
- gitlab-ci
- gitlab-mcp-server
- gitlab-rest-api
- glab-cli
- google-adk
- google-agent-engine
- google-genai
- litellm
- model-context-protocol-(mcp)
- pydantic
- python
- sarif
- secret-manager
- server-sent-events
- structlog
- tailwind-css
- tenacity
- uvicorn
- vertex-ai