What started this
A payment service got deployed. Weeks of reviews, staging tests, QA sign-off. Everything looked fine.
Two days later, an order tracking service started silently dropping requests. Not crashing. Not throwing errors. Just quietly losing data.
The root cause was a function signature change in a shared module. A 2-year-old service in a different part of the codebase was calling it. Nobody on the team knew that service existed. The fix itself took about 20 minutes. Finding everything that was affected took most of a week.
That felt like a solvable problem.
The gap existing tools leave
Every team already has tools running on their code. Here is what they cover, and what they do not:
| Tool category | What it checks | Does it answer "what breaks"? |
|---|---|---|
| Linters / formatters | Syntax, style, formatting | No |
| SAST scanners | Security vulnerabilities | No |
| Dependency scanners | Known CVEs, outdated packages | No |
| API contract tools (Pact, Dredd) | Contract violations | Only if contracts are written upfront |
| Code review bots | Style, complexity, coverage | No |
| BCGuard | Semantic impact across consumers | Yes |
The API contract tools come closest. But they require both sides of the integration to maintain explicit contracts. In practice, most teams do not maintain them consistently, and the tools cannot catch anything when no contract exists.
The real question is this: if I change this function, which other services are calling it, and will they break? Answering it requires reading the actual code, not checking it against a schema. It requires reasoning, not pattern matching.
What BCGuard catches
BCGuard detects 9 categories of breaking changes:
| Category | Example |
|---|---|
| Removed or renamed functions | getUserById deleted from public API |
| Modified function signatures | Parameter added without default, return type changed |
| Changed response shapes | Field removed or renamed in JSON response |
| Auth contract changes | Token format changed, new required header |
| Database schema changes | Column dropped, type changed, constraint added |
| Event/message contract changes | Queue message format modified |
| Config contract changes | Required environment variable renamed |
| Dependency major version bumps | library@2.x → library@3.x |
| Cross-service API changes | REST endpoint path, method, or payload modified |
These categories cover most of what breaks silently in production: the kind of change that passes all tests because the tests do not know about the consumer on the other side.
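To make the "modified function signatures" category concrete, here is a minimal Python sketch of a parameter added without a default. The two functions are hypothetical stand-ins for the getUserById example, and the check uses the standard `inspect` module. It only illustrates the failure mode; it is not how BCGuard's agents detect it.

```python
import inspect


def get_user_by_id_old(user_id):
    """Hypothetical version on the target branch."""
    return {"id": user_id}


def get_user_by_id_new(user_id, include_deleted):
    """Hypothetical version on the source branch: a parameter was added without a default."""
    return {"id": user_id, "include_deleted": include_deleted}


def new_required_params(old_func, new_func):
    """Return names of parameters that new_func requires and old_func did not have."""
    old_params = inspect.signature(old_func).parameters
    new_params = inspect.signature(new_func).parameters
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in new_params.items()
        if name not in old_params
        and param.default is inspect.Parameter.empty
        and param.kind not in variadic
    ]


def existing_call_still_works(func, *args, **kwargs):
    """Check whether an existing call site's arguments still bind to func's signature."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False
```

A consumer that still calls the function with a single argument binds cleanly against the old signature and fails against the new one. Tests written only against the new signature would not reveal that.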
Architecture
BCGuard runs as two separate multi-agent flows on the GitLab Duo Agent Platform.
Detection flow
MR opened / @mention
│
▼
┌─────────────────────────────┐
│ diff_analyzer │ READ only
│ │
│ get_merge_request │ Understand the MR context
│ list_merge_request_diffs │ Read every file change
│ get_repository_file │ Read file content with ref=branch
│ read_file │ Read local file content
│ find_files / list_dir │ Navigate repo structure
│ grep │ Search for patterns
│ │
│ Output: structured list │
│ of what semantically │
│ changed and why it matters │
└────────────┬────────────────┘
│
▼
┌─────────────────────────────┐
│ impact_scanner │ READ only
│ │
│ grep │ Find callers by function name
│ gitlab_blob_search │ Cross-file semantic search
│ list_repository_tree │ Map the full repo structure
│ find_files │ Locate consumer files
│ read_file │ Read consumer implementations
│ get_repository_file │ Read with branch context
│ │
│ Output: for each change, │
│ which files call it and │
│ how many consumers exist │
└────────────┬────────────────┘
│
▼
┌─────────────────────────────┐
│ report_writer │ WRITE
│ │
│ create_merge_request_note │ Post findings on MR
│ update_merge_request │ Set labels / metadata
│ │
│ Output: structured comment │
│ with severity, file paths, │
│ and consumer blast radius │
└─────────────────────────────┘
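The caller search that impact_scanner performs with grep can be approximated locally. The sketch below is an assumption-laden stand-in, not the agent's tooling: `find_call_sites` is a hypothetical helper that walks a checkout and reports every line where a function name is followed by an opening parenthesis.

```python
import re
from pathlib import Path


def find_call_sites(repo_root, function_name, extensions=(".js", ".py")):
    """Return (relative_path, line_number, line_text) for lines that appear to call function_name.

    This is a plain text match, so it also reports the function's own
    definition and any mention inside comments or strings.
    """
    pattern = re.compile(r"\b" + re.escape(function_name) + r"\s*\(")
    root = Path(repo_root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in extensions:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

A text match like this finds candidates only. Deciding whether each candidate actually breaks is the part the flow leaves to the agent reading the consumer's code.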
Autofix flow
Triggered separately after detection. Runs on the same MR.
@ai-bcguard-auto-fix ... Generate fixes for this MR
│
▼
┌─────────────────────────────┐
│ fix_generator │ READ only
│ │
│ list_all_mr_notes │ Read the BCGuard report
│ list_merge_request_diffs │ Read the diff
│ get_repository_file │ Read file on SOURCE branch
│ get_repository_file │ Read SAME file on TARGET branch
│ grep / find_files │ Locate affected consumers
│ │
│ Output: fix plan with │
│ OLD_CODE + NEW_CODE + │
│ WHAT_TO_DO per finding │
└────────────┬────────────────┘
│
▼
┌─────────────────────────────┐
│ fix_coder │ READ + WRITE
│ │
│ read_file │ Verify current file state
│ get_repository_file │ Double-check branch content
│ create_file_with_contents │ Write new files
│ edit_file │ Patch existing files
│ │
│ Output: modified files │
│ applied directly │
└────────────┬────────────────┘
│
▼
┌─────────────────────────────┐
│ fix_applier │ WRITE only
│ │
│ create_commit │ Commit all changes
│ create_merge_request │ Open fix MR → SOURCE branch
│ create_merge_request_note │ Comment on original MR
│ │
│ Output: fix MR ready for │
│ review, targeting feature │
│ branch not main │
└─────────────────────────────┘
Why two separate flows, not one
Combining detection and autofix into a single flow was the first design we tried. It had a problem: a single flow that both reads the codebase and writes fixes is hard to reason about from a permissions standpoint, and a bug in the write stage could corrupt files while the detection stage was still running.
Separating them means detection always runs safely. Autofix is opt-in. The write permissions are isolated to a distinct flow that only runs when explicitly triggered.
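One way to see the isolation is to write each agent's tool allow-list down as data and check which agents can touch repository contents. The tool names below are copied from the two flow diagrams. The grouping into `REPO_WRITE_TOOLS` is an assumption made for this sketch, not a platform concept.

```python
# Tool allow-lists per agent, as listed in the flow diagrams.
DETECTION_FLOW = {
    "diff_analyzer": {
        "get_merge_request", "list_merge_request_diffs", "get_repository_file",
        "read_file", "find_files", "list_dir", "grep",
    },
    "impact_scanner": {
        "grep", "gitlab_blob_search", "list_repository_tree",
        "find_files", "read_file", "get_repository_file",
    },
    "report_writer": {"create_merge_request_note", "update_merge_request"},
}

AUTOFIX_FLOW = {
    "fix_generator": {
        "list_all_mr_notes", "list_merge_request_diffs",
        "get_repository_file", "grep", "find_files",
    },
    "fix_coder": {
        "read_file", "get_repository_file",
        "create_file_with_contents", "edit_file",
    },
    "fix_applier": {
        "create_commit", "create_merge_request", "create_merge_request_note",
    },
}

# Assumption for this sketch: these are the tools that can change files or history.
REPO_WRITE_TOOLS = {"create_file_with_contents", "edit_file", "create_commit"}


def repo_writers(flow):
    """Return the agents in a flow that hold at least one repository-writing tool."""
    return sorted(agent for agent, tools in flow.items() if tools & REPO_WRITE_TOOLS)
```

Under that grouping, no agent in the detection flow can modify files. Its only write access is posting a note and updating MR metadata, while every file-changing tool lives in the opt-in autofix flow.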
The hardest problem: giving fix agents real context
Early versions of the autofix flow had a subtle bug. fix_generator read the BCGuard comment text and tried to infer what the old and new code looked like from that summary. It would see something like:
HIGH: getUserById — signature changed, parameter `includeDeleted` added
Consumers affected: order-service/src/api.js, payment-service/src/handler.js
And it would try to reconstruct the function signature from that description. It was guessing. Sometimes it guessed right. Often it got the parameter types wrong, missed optional flags, or wrote a patch that would work for one consumer but not another.
The fix: fix_generator now reads the actual file content from both the source branch and the target branch before writing any plan.
# What fix_generator does now:
get_repository_file(file_path, ref=SOURCE_BRANCH) # the new version
get_repository_file(file_path, ref=TARGET_BRANCH) # the old version
This gives the agent a real diff. It sees exactly what changed, not a summarized description. The fix plan includes the actual old code and the actual new code side by side. fix_coder works from that, and produces correct patches.
GitLab's get_repository_file tool accepts a ref parameter. That one parameter is what makes the before/after context possible. It is a small API detail that turned out to matter a lot.
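The before/after idea can be sketched outside the agent platform. In the example below, `fetch` is a hypothetical callable standing in for get_repository_file, and `FAKE_REPO` is an in-memory stand-in for the two branches. Only `difflib` from the standard library is real.

```python
import difflib


def before_after_diff(fetch, file_path, source_branch, target_branch):
    """Fetch one file at two refs and return a unified diff (old = target, new = source)."""
    new_text = fetch(file_path, ref=source_branch)
    old_text = fetch(file_path, ref=target_branch)
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"{target_branch}:{file_path}",
        tofile=f"{source_branch}:{file_path}",
        lineterm="",
    )
    return "\n".join(diff)


# Hypothetical in-memory repository: (ref, path) -> file content.
FAKE_REPO = {
    ("main", "src/services/user.js"):
        "function getUserById(id) {\n  return db.find(id);\n}\n",
    ("feature", "src/services/user.js"):
        "function getUserById(id, includeDeleted) {\n  return db.find(id, includeDeleted);\n}\n",
}


def fake_fetch(file_path, ref):
    """Stand-in fetcher; a real one would read the file from the given branch."""
    return FAKE_REPO[(ref, file_path)]
```

Calling `before_after_diff(fake_fetch, "src/services/user.js", "feature", "main")` returns a diff containing both the old and the new signature line. That is the information the summary comment alone could not supply.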
What the output looks like
A typical BCGuard comment on a merge request looks like this:
## BCGuard Analysis
**Verdict: BREAKING CHANGES DETECTED**
---
### HIGH (3 findings)
**1. getUserById — signature changed**
File: `src/services/user.js`
What changed: Parameter `includeDeleted: boolean` added without default value
Consumers affected:
- `order-service/src/api.js` (line 47)
- `payment-service/src/handler.js` (line 112)
- `notification-service/src/jobs.js` (line 23)
**2. /api/v1/users/:id — response shape changed**
File: `src/routes/users.js`
What changed: Field `metadata` removed from response body
Consumers affected:
- `dashboard/src/components/UserCard.jsx` (line 88)
...
---
To generate fixes: `@ai-bcguard-auto-fix-gitlab-ai-hackathon Generate fixes for this MR`
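For readers who want to work with these findings as data, one possible shape is sketched below. The `Finding` and `Consumer` classes and the `render_finding` helper are hypothetical. They mirror the fields visible in the sample comment and are not BCGuard's internal format.

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    path: str
    line: int


@dataclass
class Finding:
    severity: str
    title: str
    file: str
    what_changed: str
    consumers: list = field(default_factory=list)


def render_finding(index, finding):
    """Render one finding in the layout used by the sample comment."""
    lines = [
        f"**{index}. {finding.title}**",
        f"File: `{finding.file}`",
        f"What changed: {finding.what_changed}",
        "Consumers affected:",
    ]
    lines += [f"- `{c.path}` (line {c.line})" for c in finding.consumers]
    return "\n".join(lines)
```

Keeping the consumer list as structured data rather than prose means the number of affected consumers is available as `len(finding.consumers)`.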
What is next
| Feature | Description | Status |
|---|---|---|
| Cross-repo scanning | Use gitlab_group_project_search to trace consumers across every repo in an org, not just the current one | Planned |
| Required CI check | BCGuard runs automatically on every MR, not just when @mentioned | Planned |
| Blast-radius weighted severity | A function with 50 consumers is a different risk than one with 1. Score accordingly | Planned |
| Historical pattern detection | A file that has caused breaking changes 3 times gets flagged on the 4th touch | Planned |
| Vertex AI trend reports | Weekly summaries of breaking change patterns across the org | In progress |
The goal is not a smarter linter. The goal is that merging a breaking change becomes as structurally difficult as pushing to main without tests passing.
Every piece of information needed to catch these failures already exists in the codebase. BCGuard is about connecting those dots automatically, before the deploy, every time.
Built With
- agent
- claude
- gitlab
- python