Merge Memory: MR history for every code review

Inspiration

We watched senior engineers do code review. The best ones weren't reading the diff more carefully — they were pulling up related MRs, grepping through history, and cross-referencing pipeline failures from six months ago. That institutional memory lived in their heads, not in the tooling.

Newer teammates either spent 20 minutes reconstructing that context manually, or they approved without it. Both options are expensive. The first doesn't scale. The second is how regressions ship.

GitLab Orbit changed what's possible here. A structured, queryable knowledge graph of your codebase means you can finally ask: what else co-changes with this file? What broke last time someone touched this module? Who actually owns this code? We built Merge Memory to surface those answers automatically, at exactly the moment reviewers need them.

What it does

Submit a GitLab MR URL. Merge Memory fetches the diff, queries Orbit for file relationships and history, runs a deterministic evidence pipeline, and produces a structured report in seconds:

Why this code exists — historical MRs that touched the same files, scored by recency and hotfix signals
Blast radius — files and modules that depend on the changed code, ranked by Orbit graph distance
Regression signals — hotfix patterns, failing pipelines, and missing test coverage
Suggested reviewers — ranked by file ownership, MR history, and recent related activity
Required tests — derived from historical co-changes and pipeline failure patterns

Every claim carries a traceable evidence array pointing to the exact API call and data item. Nothing is invented.
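One way to picture the evidence model is as a claim type that cannot be displayed without at least one evidence reference. The sketch below is illustrative only: the type names, field names, and section labels are assumptions, not the project's actual schema.

```typescript
// Hypothetical shapes: every name here is an assumption for illustration.
type EvidenceSource = "gitlab-rest" | "orbit";

interface EvidenceRef {
  source: EvidenceSource;
  request: string; // label for the API call that produced the data
  itemId: string;  // identifier of the specific item in that response
}

interface Claim {
  id: string;
  section: "history" | "blast-radius" | "regression" | "reviewers" | "tests";
  statement: string;
  evidence: EvidenceRef[];
}

// A claim counts as traceable only if it cites at least one complete reference.
function isTraceable(claim: Claim): boolean {
  return (
    claim.evidence.length > 0 &&
    claim.evidence.every((e) => e.request.length > 0 && e.itemId.length > 0)
  );
}

// Claims without evidence are dropped before anything is rendered.
function displayableClaims(claims: Claim[]): Claim[] {
  return claims.filter(isTraceable);
}
```

Filtering at this boundary means a report section can only ever show statements that point back to a recorded response.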

How we built it

The core pipeline is entirely deterministic — no LLM in the analysis. Seven builders each query GitLab REST or Orbit, score their results, and attach structured evidence references. GitLab Duo is post-hoc only: it receives a compact list of computed claim IDs and formats them into prose. Responses citing unknown IDs or inventing facts are rejected at the API boundary.
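The builder pattern described above might look roughly like the following. This is a sketch under stated assumptions: it treats each builder as a pure function over data that has already been fetched, and `MrSnapshot`, `ScoredClaim`, `hotfixBuilder`, and `runBuilders` are hypothetical names rather than the project's code.

```typescript
// Hypothetical types: names and fields are assumptions for illustration.
interface MrSnapshot {
  changedFiles: string[];
  relatedMrs: { id: string; files: string[]; isHotfix: boolean }[];
}

interface ScoredClaim {
  id: string;
  builder: string;
  score: number;
  evidenceIds: string[];
}

type Builder = (snapshot: MrSnapshot) => ScoredClaim[];

// Example builder: flags hotfix MRs that touched any of the changed files.
const hotfixBuilder: Builder = (snapshot) => {
  const claims: ScoredClaim[] = [];
  for (const mr of snapshot.relatedMrs) {
    const overlaps = mr.files.some((f) => snapshot.changedFiles.indexOf(f) !== -1);
    if (mr.isHotfix && overlaps) {
      claims.push({ id: `hotfix:${mr.id}`, builder: "hotfix", score: 1, evidenceIds: [mr.id] });
    }
  }
  return claims;
};

function runBuilders(snapshot: MrSnapshot, builders: Builder[]): ScoredClaim[] {
  const all: ScoredClaim[] = [];
  for (const build of builders) {
    for (const claim of build(snapshot)) {
      // Claims without evidence never enter the report.
      if (claim.evidenceIds.length > 0) all.push(claim);
    }
  }
  // Sort by score, then by id, so identical input always yields identical output.
  return all.sort((a, b) => b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}
```

Separating fetching from scoring keeps the scoring step a pure function, which is what makes the output reproducible for the same input.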

The stack: Next.js 15 App Router, TypeScript strict mode, Zod for every API response, PostgreSQL + Drizzle for report persistence, AES-256-GCM for token encryption, deployed on Cloud Run.
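For readers unfamiliar with AES-256-GCM, token encryption with Node's built-in `crypto` module can be sketched as below. The payload layout (IV, auth tag, and ciphertext joined with dots) and the function names are assumptions for illustration; the project's actual storage format is not described here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Assumed layout: base64(iv).base64(authTag).base64(ciphertext)
function encryptToken(token: string, key: Buffer): string {
  if (key.length !== 32) throw new Error("AES-256 requires a 32-byte key");
  const iv = randomBytes(12); // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(token, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // must be read after final()
  return [iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

function decryptToken(payload: string, key: Buffer): string {
  const parts = payload.split(".");
  if (parts.length !== 3) throw new Error("malformed payload");
  const [iv, tag, ciphertext] = parts.map((p) => Buffer.from(p, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the payload was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

GCM authenticates as well as encrypts, so a modified ciphertext or wrong key fails loudly at decryption instead of yielding garbage.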

Orbit is the engine that makes the blast radius and co-change analysis possible. GitLab REST alone can tell you what changed. Orbit tells you what depends on it, what broke before it, and who touched it last quarter. That's the graph traversal that turns a diff into a risk assessment.

Challenges we ran into

Getting the evidence model right was the hardest design problem. Early versions let Duo summarize freely — and it hallucinated confidently. The fix was strict: Duo may only reference claim IDs that exist in the report, responses are validated against the computed claim set, and any violation returns an error rather than a degraded summary.
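The claim-ID check described above can be sketched as a small validator. The names `DuoSummary`, `citedClaimIds`, and `validateSummary` are hypothetical, and the sketch covers only the unknown-ID rule; detecting invented facts in the prose itself would need additional checks not shown here.

```typescript
// Hypothetical response shape: field names are assumptions for illustration.
interface DuoSummary {
  text: string;
  citedClaimIds: string[];
}

type ValidationResult =
  | { ok: true; summary: DuoSummary }
  | { ok: false; error: string };

// Reject the whole summary if it cites nothing or cites an ID the pipeline never computed.
function validateSummary(summary: DuoSummary, knownClaimIds: Set<string>): ValidationResult {
  if (summary.citedClaimIds.length === 0) {
    return { ok: false, error: "summary cites no claims" };
  }
  const unknown = summary.citedClaimIds.filter((id) => !knownClaimIds.has(id));
  if (unknown.length > 0) {
    return { ok: false, error: `unknown claim IDs: ${unknown.join(", ")}` };
  }
  return { ok: true, summary };
}
```

Returning an error instead of stripping the bad citations matches the rule that a violation is never downgraded to a partial summary.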

Orbit's schema required careful query design for blast radius — naive traversal returned too much noise. We built a scoring function combining graph distance, co-change frequency, import count, and security adjacency to produce a ranked, actionable list rather than a raw graph dump.
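A scoring function of this kind might combine the four signals as a weighted sum. Everything numeric below is an assumption made for illustration: the weights, the normalization caps, and the `minScore` cutoff are placeholders, not the values the project uses.

```typescript
// Hypothetical candidate shape and weights, chosen only to illustrate the idea.
interface BlastCandidate {
  path: string;
  graphDistance: number;    // hops from a changed file, 1 = direct dependent
  coChangeCount: number;    // MRs in the window where it changed alongside a changed file
  importCount: number;      // how many modules import it
  securityAdjacent: boolean;
}

const WEIGHTS = { proximity: 0.4, coChange: 0.3, imports: 0.2, security: 0.1 };

function blastScore(c: BlastCandidate, mrWindow: number): number {
  const proximity = 1 / Math.max(1, c.graphDistance);          // closer scores higher
  const coChange = Math.min(1, c.coChangeCount / Math.max(1, mrWindow));
  const imports = Math.min(1, c.importCount / 10);             // assumed cap at 10 imports
  const security = c.securityAdjacent ? 1 : 0;
  return (
    WEIGHTS.proximity * proximity +
    WEIGHTS.coChange * coChange +
    WEIGHTS.imports * imports +
    WEIGHTS.security * security
  );
}

// Drop low-scoring noise, then rank; ties break on path for stable output.
function rankBlastRadius(
  candidates: BlastCandidate[],
  mrWindow: number,
  minScore = 0.2
): { path: string; score: number }[] {
  return candidates
    .map((c) => ({ path: c.path, score: blastScore(c, mrWindow) }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score || (a.path < b.path ? -1 : 1));
}
```

The threshold is what turns a raw traversal into a short list: distant files with no co-change history fall below it and never reach the reviewer.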

Accomplishments that we're proud of

Zero hallucinations by construction. Every displayed claim is traceable to a specific Orbit or GitLab REST response. The Evidence Inspector tab exposes the raw traces for every section of every report.

The pipeline runs in under 3 seconds on real MRs. Fast enough to be part of the review workflow, not a separate research task.

Five production-realistic mock scenarios ship out of the box so anyone can run the full experience locally with no credentials.

What we learned

Orbit is genuinely a different primitive from GitLab REST. The moment you can ask "what co-changes with this file across the last 200 MRs" and get a scored, ranked answer in milliseconds, a whole class of review automation becomes tractable. We spent the first few days building against REST and quickly hit a ceiling. Switching to Orbit unblocked everything.
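To make the co-change question concrete, here is a plain in-memory illustration of what "co-changes with this file" means when computed from per-MR file lists. It is not how Orbit answers the query; the function names and input shape are assumptions.

```typescript
// Count how often each other file appears in the same MR as the target file.
function coChangeCounts(target: string, mrFileLists: string[][]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const files of mrFileLists) {
    if (files.indexOf(target) === -1) continue; // MR did not touch the target
    const seen: Record<string, boolean> = {};
    for (const file of files) {
      if (file === target || seen[file]) continue; // count each file once per MR
      seen[file] = true;
      counts[file] = (counts[file] || 0) + 1;
    }
  }
  return counts;
}

// Rank co-changing files by count, breaking ties alphabetically.
function topCoChanges(
  target: string,
  mrFileLists: string[][],
  limit: number
): { file: string; count: number }[] {
  const counts = coChangeCounts(target, mrFileLists);
  return Object.keys(counts)
    .map((file) => ({ file, count: counts[file] }))
    .sort((a, b) => b.count - a.count || (a.file < b.file ? -1 : 1))
    .slice(0, limit);
}
```

Doing this over REST means fetching the file list of every MR in the window first, which is the ceiling the paragraph above refers to.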

LLMs are useful for formatting, not for reasoning over code history. The deterministic pipeline is faster, cheaper, and fully auditable. Duo earns its place as the last step — not the first.

What's next for Merge Memory: MR history for every code review

First, webhook integration, so reports generate automatically when an MR opens and nobody has to paste a URL. Second, risk-gated merge rules that block approval until high-risk signals are acknowledged. Third, a GitLab Duo Skill published to the AI Catalog so any team can add MR context as a native skill in its review workflow.

Built With

docker
drizzle-orm
gitlab-duo
gitlab-orbit
gitlab-rest-api
google-cloud-run
next.js-15
postgresql
shadcn/ui
tailwind-css
typescript
vitest
zod

Updates

Younes Laaroussi started this project — Jun 24, 2026 01:42 PM EDT
