-
-
Developer mentions HIPAA-Guard in the MR. The flow activates instantly.
-
Agent 5 posts the full compliance report directly on MR !12.
-
Three CRITICAL violations with exact file locations, rules, and fixes.
-
Agent 4 confirms all violations resolved. Score improved from 0 to 95.
-
Live view of all 5 agents running in sequence with structured output.
-
Agent 2 produces structured classification with 45 CFR citations per finding.
-
Agent 3 reads the exact file, applies minimal fixes, and pushes to duo/fix/hipaa-guard-12.
-
HIPAA-Guard opened MR !13 with a complete remediation description, violations resolved list, and follow-up instructions.
-
Pipeline passed. MR is ready to merge with one click.
-
The Advisor appears in the GitLab Duo Chat agent selector alongside other agents.
-
Developer asks why their MR was blocked. Claude explains with citations inside the MR view.
-
Companion Flask app running at http://localhost:5001.
-
Paste code and click Run Demo Scan — instant compliance report.
-
Each finding card shows severity, why it matters, developer impact, safer action, and flagged code.
Inspiration
While working on a healthcare AI project, I kept asking myself a question that genuinely unsettled me, what actually prevents a developer from accidentally pushing a patient's name, SSN, or diagnosis into a log file or an API response? I looked around and realized the answer was almost nothing. Existing tools catch syntax errors and security vulnerabilities, but none of them truly reason about HIPAA compliance. In healthcare software, that blind spot is not just a bug. It is a liability. That frustration is what sparked HIPAA-Guard.
What it does
HIPAA-Guard is an AI enforcement layer that lives inside your GitLab CI pipeline. Every time a developer opens a merge request, it automatically scans the code diff for Protected Health Information (PHI) risk. It does not just flag a line and move on. It uses Claude to explain why that code is a violation in plain language, generates a compliant replacement, verifies whether the fix actually resolves the issue, and blocks the merge entirely if the code remains unsafe. Think of it as a HIPAA-aware code reviewer that never sleeps and never misses context.
How we built it
The core is built around GitLab's Merge Request API, which we hooked into a CI job that triggers on every merge request. The diff is extracted and sent to Claude via the Anthropic API with a carefully engineered prompt that gives it full HIPAA context, including what counts as PHI, what safe handling looks like, and what a compliant fix should accomplish.
Risk is quantified using a weighted signal model:
$$R = \sum_{i=1}^{n} w_i \cdot \phi_i$$
where φᵢ represents each detected PHI signal such as name, SSN, date of birth, or diagnosis, and wᵢ is its corresponding HIPAA severity weight. If R exceeds a defined threshold, the pipeline fails and the merge is blocked.Fix generation runs as a second Claude pass, taking the violation context and producing drop-in compliant code. A third verification pass then confirms the fix fully resolves the issue before the pipeline approves the merge.
Challenges we ran into
The hardest problem was precision. Early versions were overly aggressive,
flagging variable names like patient_id even when no actual PHI was
present in the data. Getting Claude to distinguish between structural PHI
risk and actual PHI exposure required significant prompt iteration and
testing across a wide range of real-world code patterns. We also had to
keep latency reasonable. A compliance check that takes 45 seconds kills
developer workflow, so we spent considerable time optimizing our diff
extraction and prompt structure to stay within acceptable CI runtime
limits without sacrificing accuracy.
Accomplishments that we're proud of
The full end-to-end pipeline works, and that alone felt like a significant win. Detect, explain, fix, verify, block. Most tools in this space stop at detection and leave the developer to figure out the rest. Getting the fix generation to produce code that is genuinely drop-in ready rather than just advisory was the milestone I am most proud of. It transforms HIPAA-Guard from a blocker into a collaborator, something that actually helps developers move faster rather than slowing them down.
What we learned
LLMs are remarkably effective compliance reasoners when given the right context. Rule-based scanners and regex patterns will always lose to developers who rename variables or restructure their data flows. Claude does not care about variable names. It understands intent, and that makes all the difference when detecting nuanced PHI risk buried inside otherwise clean-looking code.
What's next for HIPAA-Guard
The immediate next step is expanding beyond GitLab to support GitHub and Bitbucket pipelines. Beyond that, we want to add support for additional regulatory frameworks including SOC 2, GDPR, and 21 CFR Part 11 for clinical software. The longer-term vision is a compliance dashboard that gives engineering leads a full audit trail of PHI risk decisions across every merge request in their organization, making compliance not just enforceable but visible.
Built With
- claude
- css
- environment-variables
- flask
- gitlab-ai-catalog-mapping
- gitlab-ci/cd
- gitlab-duo-agent-platform
- gitlab-mcp-tools
- gitlab-merge-requests
- html
- javascript
- python
- rest-style-api-patterns
- yaml

Log in or sign up for Devpost to join the conversation.