Quorum

Review of a GitLab MR using the Agent Playground on Google Cloud Run
Quorum findings on a GitHub pull request with a coordination-related bug
Comment posted by Quorum on an open-source GitHub repo PR
Merge request opened by Quorum on GitLab
A Quorum finding was substantial enough that a containerd/nerdbox contributor considered adding it
Quorum posted a comment on containerd/nerdbox; the suggested fix was added and the PR was merged
Real open-source repos across GitHub and GitLab reviewed by Quorum

Inspiration

I'm early in my backend career, and the bugs that scared me most weren't the ones the compiler caught — they were the distributed ones. The first time I read Kleppmann on distributed locks, I realised a Redis lock I'd shipped with a static "locked" value was quietly broken. It passed review. It passed the linter. Nobody noticed, because the bug wasn't in my code — it was in what my code was missing. Linters can't see that, and generic AI reviewers only read the diff. So I built the reviewer I wished I'd had as a junior — one that checks the rest of the codebase before trusting the diff.

What it does

Quorum reviews GitLab MRs and GitHub PRs for 14 named distributed-coordination anti-patterns, covering fencing tokens, saga compensation, retry jitter, lost updates, Kafka offsets, missing DLQs, cascading timeouts, and more. Each rule comes from a real source (Kleppmann, AWS, Confluent, the Google SRE book), so a finding teaches you why it's wrong, with a link and a fix.

How I built it

Two phases: a free regex scan of the diff that exits before spending a token if nothing smells off, then a Gemini 2.5 Pro agent loop that calls GitLab's MCP tools (semantic_code_search, get_file_contents) round after round — searching the repo, reading files, reasoning — until it's sure. That MCP search is the whole point: it's how the agent confirms a rollback handler is genuinely absent from the whole project, not just missing from the diff. Rules are data, not code, so adding one is a single-file PR. It runs three ways on Google Cloud: an interactive Vertex AI Agent Engine agent (Google ADK), a Cloud Run webhook, and a CLI — plus Quorum Live, a hosted page that streams the agent's thinking live.
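To make the "round after round" loop concrete, here is a minimal sketch of a tool-calling loop. It is not Quorum's actual code: `run_agent_loop`, `model_step`, the reply dictionary shape, and the `max_rounds` budget are all hypothetical, and `model_step` stands in for the Gemini call so the control flow can be read without any API access.

```python
def run_agent_loop(model_step, tools, max_rounds=8):
    """Drive tool calls until the model returns findings or the budget runs out.

    model_step(transcript) is a hypothetical stand-in for one model turn. It must
    return either {"type": "tool_call", "tool": name, "args": {...}}
    or {"type": "findings", "findings": [...]}.
    """
    transcript = []  # every tool call and its result, fed back to the model
    for _ in range(max_rounds):
        reply = model_step(transcript)
        if reply["type"] == "findings":
            return reply["findings"]
        # Dispatch the requested tool (e.g. semantic_code_search) and record the result.
        result = tools[reply["tool"]](**reply["args"])
        transcript.append(
            {"tool": reply["tool"], "args": reply["args"], "result": result}
        )
    # Budget exhausted without a verdict: report nothing rather than guess.
    return []
```

Returning an empty list when the budget runs out is one possible design choice; it favours missing a finding over posting an unverified one.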

Challenges I ran into

Real MCP search was hard to reach: GitLab's HTTP MCP needs an OAuth session I couldn't script, so I fell back to the `glab mcp serve` CLI with a token, backed by two more client tiers.
The Agent Engine Playground is ADK-only — my first deploy used the older API and got refused; I re-wrapped everything in a Google ADK app.
False positives kill trust — so I ran Quorum against GitLab's own production repos and tuned until they all correctly passed.

How Quorum is architected

Quorum is one review engine that runs in three places, with a strict split between thinking and acting.

The review pipeline (two phases).

Surface detector — free, instant. A pure-Python regex/keyword scan of the diff. A typical CRUD merge request touches no coordination pattern, so it exits in milliseconds with zero API calls. Gemini is only woken up when the diff actually smells of distributed coordination.
Gemini agent loop — only when needed. When a surface fires, Gemini 2.5 Pro (with thinking_budget and Google Search grounding) drives a multi-round investigation, calling GitLab's MCP tools — semantic_code_search, get_file_contents, get_merge_request — over and over until it can emit structured JSON findings, each with a severity, a self-reported confidence score, and a reference.
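The two-phase gate above can be sketched as follows. This is an illustration under assumptions, not the project's detector: the pattern names and regexes in `SURFACE_PATTERNS` are invented for the example, and `run_agent` is a hypothetical callable standing in for the Gemini phase.

```python
import re

# Hypothetical surface patterns; the real rule set is not shown in this write-up.
SURFACE_PATTERNS = {
    "distributed_lock": re.compile(r"\b(setnx|redlock|acquire_lock)\b", re.IGNORECASE),
    "retry": re.compile(r"\b(retry|backoff)\b", re.IGNORECASE),
    "kafka_offset": re.compile(r"\b(commit_offsets?|enable\.auto\.commit)\b", re.IGNORECASE),
}


def detect_surfaces(diff_text):
    """Return the names of surfaces whose pattern matches an added diff line."""
    added = [
        line[1:]
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")  # skip file headers
    ]
    return {
        name
        for name, pattern in SURFACE_PATTERNS.items()
        if any(pattern.search(line) for line in added)
    }


def review(diff_text, run_agent):
    """Phase 1 gates phase 2: the agent runs only when a surface fires."""
    surfaces = detect_surfaces(diff_text)
    if not surfaces:
        return []  # typical CRUD change: exit with zero model calls
    return run_agent(diff_text, sorted(surfaces))
```

Scanning only added lines keeps the gate cheap, at the cost of not noticing a change that removes a safeguard; whether Quorum makes the same trade-off is not stated here.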

Rules are data, not code. Every one of the 14 rules is a single Python dataclass — keywords, regex, search-query templates, and reasoning guidance injected into Gemini's prompt. There are no if/else detection chains; the intelligence lives in the prompt. Adding RULE_15 is a one-file pull request.
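As a rough picture of what "one rule, one dataclass" could look like, here is a sketch. The field names, the `matches` and `prompt_section` helpers, and the example rule's keywords, regex, and wording are all assumptions made for illustration; they are not copied from Quorum's rule files.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One anti-pattern described purely as data (hypothetical field names)."""
    rule_id: str
    title: str
    keywords: tuple      # cheap substring triggers for the surface scan
    pattern: str         # regex trigger for the surface scan
    search_queries: tuple  # templates the agent can send to code search
    guidance: str        # reasoning guidance injected into the model prompt
    reference: str       # where the rule comes from

    def matches(self, text):
        lowered = text.lower()
        return any(k in lowered for k in self.keywords) or bool(
            re.search(self.pattern, text)
        )

    def prompt_section(self):
        return f"[{self.rule_id}] {self.title}\n{self.guidance}\nSource: {self.reference}"


# Illustrative rule only; the real rule's content may differ.
FENCING_TOKEN = Rule(
    rule_id="fencing-token",
    title="Distributed lock without a fencing token",
    keywords=("setnx", "redlock"),
    pattern=r"\.set\([^)]*nx\s*=\s*True",
    search_queries=("fencing token", "lock value compared before release"),
    guidance="Check whether the lock value is unique per holder and verified on release.",
    reference="Kleppmann on distributed locks",
)

RULES = {rule.rule_id: rule for rule in (FENCING_TOKEN,)}
```

With this shape, adding a rule means adding one more `Rule(...)` instance to the registry, with no change to detection code.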

Thinking and acting are separated on purpose. Gemini only investigates; it never writes. Deterministic Python decides when to post a comment, apply labels, or open a fix MR, then watches the pipeline to verify the fix passes CI. You never want a model autonomously writing to someone's repository.
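One way to picture the deterministic side is a pure function that maps the model's findings to write actions. Everything in this sketch is assumed for illustration: the `Finding` fields, the 0.7 cut-off, and the policy that only high-severity findings with a proposed fix open a merge request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str
    severity: str      # "low", "medium", or "high"
    confidence: float  # self-reported by the model, 0.0 to 1.0
    has_fix: bool      # whether a concrete fix was proposed


def plan_actions(findings, min_confidence=0.7):
    """Decide write actions with plain rules; the model never chooses them."""
    kept = [f for f in findings if f.confidence >= min_confidence]
    actions = []
    if kept:
        actions += ["post_comment", "apply_labels"]
    # Assumed policy: only confident, high-severity findings with a fix open an MR.
    if any(f.severity == "high" and f.has_fix for f in kept):
        actions.append("open_fix_mr")
    return actions
```

Because `plan_actions` has no model call and no side effects, the policy can be unit-tested and audited independently of the investigation phase.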

Three deployment modes, one engine.

Mode	Surface	Google Cloud
Interactive	Agent you can chat with, built on the Google ADK	Vertex AI Agent Engine
Automated	GitLab/GitHub webhook — reviews on every push, plus the Quorum Live SSE demo page	Cloud Run
Power-user	`quorum` CLI + Python SDK for CI, with SARIF output	Local / any CI

Resilience touches: a confidence threshold suppresses low-signal findings; a three-tier MCP client (official glab CLI → @zereight npm → REST) means it works whatever is installed; and a tool-call recovery loop nudges Gemini back to structured function calls when context caching makes it emit tool calls as plain text. Secrets live only in Secret Manager — never in the image.
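The three-tier client idea can be sketched as an ordered fallback. The names `TierUnavailable` and `first_available` are hypothetical, and each tier is reduced to a plain callable; how Quorum actually probes the glab CLI, the npm server, or the REST API is not described here.

```python
class TierUnavailable(Exception):
    """Raised by a tier that cannot be used in this environment."""


def first_available(tiers, query):
    """Try each (name, search) tier in order and return the first that answers.

    tiers is a sequence such as [("glab", fn), ("npm", fn), ("rest", fn)],
    where each fn(query) returns results or raises TierUnavailable.
    """
    errors = []
    for name, search in tiers:
        try:
            return name, search(query)
        except TierUnavailable as exc:
            errors.append(f"{name}: {exc}")  # remember why this tier was skipped
    raise RuntimeError("no MCP tier available: " + "; ".join(errors))
```

Only `TierUnavailable` triggers the fallback in this sketch, so a genuine bug in a working tier still surfaces instead of being silently masked by the next one.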

Accomplishments I'm proud of

Quorum found real coordination bugs in 9 open-source projects across Java, Go, Python, and TypeScript — all filed publicly, zero false positives. One fix, on containerd/nerdbox PR #218, got merged after Quorum's comment.

What I learned

The unit of AI code review isn't the diff — it's the investigation. The second I let Gemini drive its own MCP search loop instead of judging lines in isolation, the false positives dropped and it caught the exact bugs I'd been afraid of. And the trust comes from the boring part: plain Python decides when to post or fix — never the model.

Future goals

Quorum is built to grow from a one-shot reviewer into a persistent collaborator.

Public, no-login chat playground. The Agent Engine Playground is IAM-gated, so outside viewers can't open it. A public chat proxy will stream the live ADK agent through the /demo page over SSE using the Cloud Run service account — anyone can converse with the reviewer, no Google account needed (with rate-limiting and a per-session cost cap).
Close-the-loop follow-up. A quorum follow-up command will re-check every issue Quorum has filed: re-run the surface detector at the commit that closed it, and if the anti-pattern is gone, post a final "Fixed in <sha> — confirmed by Quorum." This turns it from a reviewer into a teammate that verifies its own findings get resolved.
Multi-tenant SaaS. OAuth login so Quorum acts on each user's own token (no shared PAT), a per-org dashboard of findings by rule/severity and fix-acceptance rate, and a persistent Firestore/PostgreSQL backend for trend analysis and regression detection across deployments.
A growing taxonomy. Because every rule is a standalone, single-file contribution, the rule set is designed to expand past 14 with the community — covering more coordination patterns across more languages.

Built With

alpine.js
click
cloud-build
cloud-run
docker
fastapi
gemini-2.5-pro
github-actions
github-rest-api
gitlab-ci
gitlab-mcp-server
gitlab-rest-api
glab-cli
google-adk
google-agent-engine
google-genai
litellm
model-context-protocol-(mcp)
pydantic
python
sarif
secret-manager
server-sent-events
structlog
tailwind-css
tenacity
uvicorn
vertex-ai

Updates

Kaustubh Upadhyay started this project — Jun 11, 2026 01:37 PM EDT
