Inspiration

I work in health tech in Kenya. For the past few years, I've been deploying OpenMRS-based systems across 50+ counties. Over 100,000 patients depend on these systems daily.

Every time we ship a release, there's this moment where you wonder: did anyone log a patient's MRN somewhere they shouldn't have? Is that new FHIR endpoint actually checking auth? Did the intern hardcode the database password again?

Healthcare breaches are not cheap. We're talking $10.93M average cost per breach. A single HIPAA violation can hit you for $50K to $1.9M. And in Kenya, the Data Protection Act is getting teeth now too.

I've used GitLab's built-in security scanning before. It catches SQL injection and XSS fine. But it has no idea what an MRN is. It doesn't know that logging patient.getFullName() is a HIPAA violation. It can't tell you that sending FHIR resources over HTTP breaks §164.312(e).

That gap between generic security tools and what healthcare teams actually need is why I built HealthGuard AI.

What it does

You mention @ai-healthguard-compliance-flow-gitlab-ai-hackathon on a merge request. It reads every changed file, finds healthcare compliance problems, and posts a structured report right there on the MR.

Not vague stuff like "consider encrypting your data." I'm talking specific findings like: "Line 47 of PatientController.java logs patient MRN to console. This violates HIPAA §164.312(a). Here's the fix: use logger.info("Patient accessed: {}", maskIdentifier(patient.getMrn())) instead."
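To make that suggested fix concrete, here is a minimal sketch of what a masking helper could look like, written in Python for brevity. The name mask_identifier, the default of four visible characters, and the mask character are my assumptions for illustration; the finding above only names a maskIdentifier helper and does not define it.

```python
def mask_identifier(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last `visible` characters of an identifier.

    Hypothetical helper: the policy (how many characters stay visible)
    is an assumption, not something the report prescribes.
    """
    if not value:
        return ""
    if len(value) <= visible:
        # Too short to reveal anything safely, so mask the whole value.
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]
```

A log call would then pass mask_identifier(patient_mrn) instead of the raw MRN, so the log line stays useful for tracing without exposing the full identifier.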

It checks for 8 categories:

  1. PHI showing up in logs or API responses
  2. FHIR endpoints without OAuth
  3. Missing authentication on patient routes
  4. Unencrypted data transmission
  5. Audit trail gaps
  6. Consent violations and cross-border transfer issues
  7. Hardcoded credentials
  8. Database security problems

Each finding gets mapped to the specific regulation it breaks. Not just "HIPAA" but the exact section. Not just "GDPR" but the specific article. Not just "Kenya DPA" but the clause number.

Critical violations block the merge. You fix them, run it again, it passes. Simple.
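As a rough sketch of the idea, a finding can be modelled as a small record that carries its exact citation, and the merge decision as a check for any Critical entry. The Finding fields, the Severity enum, and the blocks_merge function are hypothetical names chosen for this example; they are not the agent's actual output schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"


@dataclass(frozen=True)
class Finding:
    """One compliance finding (hypothetical shape, for illustration)."""
    file: str
    line: int
    category: str       # one of the 8 categories listed above
    severity: Severity
    citation: str       # exact section, e.g. "HIPAA §164.312(a)"
    fix: str            # suggested remediation


def blocks_merge(findings: Iterable[Finding]) -> bool:
    """Return True if any finding is Critical, which blocks the merge."""
    return any(f.severity is Severity.CRITICAL for f in findings)
```

Keeping the citation as a required field is the point of the design: a finding that cannot name the section it breaks is not reportable.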

Works on Java, Python, TypeScript, Docker configs, whatever healthcare code you're pushing.

Architecture

Four components make the full system work end to end:

1. HealthGuard Scanner Agent (Custom Agent in AI Catalog)

The brain is the system prompt. It encodes what FHIR Patient resources look like, what OpenMRS REST patterns to watch for, which HIPAA sections map to which code patterns. This knowledge comes from actually deploying these systems across Kenya, not from skimming a compliance PDF. It uses read_file, list_merge_request_diffs, and gitlab_blob_search to analyze every changed file.

2. HealthGuard Compliance Flow (Orchestrated Multi-Step Flow)

Two agents chained: Scanner feeds into Reporter. The Scanner identifies violations. The Reporter formats a structured compliance report, posts it as an MR comment, and creates tracking issues for Critical and High severity findings. The flow is triggered automatically when you @mention it or assign it as a reviewer on any MR.

3. External Anthropic Claude Agent

Configured with injectGatewayToken: true for GitLab-managed credentials. Anthropic Claude's long context window handles deep whole-file compliance analysis. Good for cases where you need the full picture, not just the diff. The agent runs through GitLab's AI Gateway, which routes requests through Google Cloud Vertex AI infrastructure.

4. CI/CD Compliance Gate (Green Agent)

A pipeline stage that runs lightweight grep-based scanning for 7 violation categories before the AI agent even spins up. It catches the obvious stuff fast: hardcoded credentials, PHI in logs, missing TLS. If critical violations are found, it exits with code 1 and blocks the merge immediately. This saves CI/CD compute minutes by catching low-hanging fruit without burning AI model tokens. Only the complex, contextual analysis gets sent to the AI agent. In testing, this gate catches ~40% of violations on its own, which I estimate translates into a comparable reduction in AI requests and pipeline compute.
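The pre-screening idea in the compliance gate can be sketched in a few lines of Python. This is an illustration only: the three rules and their regular expressions are my assumptions, not the actual grep patterns in the repo, and they cover only a subset of the categories the gate checks. Simple patterns like these will produce false positives, which is acceptable for a cheap first pass.

```python
import re
from pathlib import Path

# Hypothetical rules for illustration; the real gate uses its own grep patterns.
PATTERNS = {
    "hardcoded-credential": re.compile(
        r"(?i)(password|passwd|secret|api_key)\s*[:=]\s*[\"'][^\"']+[\"']"
    ),
    "phi-in-log": re.compile(
        r"(?i)\b(log(ger)?\.\w+|print|console\.log)\s*\(.*"
        r"\b(mrn|ssn|getName|getFullName|getMrn)\b"
    ),
    "plaintext-http": re.compile(r"http://[^\s\"']+"),
}


def scan_text(name, text):
    """Return (file name, line number, rule) for every line matching a rule."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((name, lineno, rule))
    return hits


def main(paths):
    """Scan the given files and return 1 if anything matched, else 0."""
    hits = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        hits.extend(scan_text(path, text))
    for name, lineno, rule in hits:
        print(f"{name}:{lineno}: {rule}")
    return 1 if hits else 0
```

In a pipeline job, the script would end with sys.exit(main(sys.argv[1:])) so that a non-zero return fails the stage before any AI request is made.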

How we built it

No external servers. No custom APIs. Everything is YAML configs and a system prompt.

The agent lives in agents/agent.yml. The brain is the system prompt, and that's where all the healthcare compliance knowledge goes. What FHIR Patient resources look like. What OpenMRS REST patterns to watch for. Which HIPAA sections map to which code patterns. This stuff comes from years of deploying these systems in the field.

The flow is in flows/flow.yml. Two steps chained together with routers. Scanner reads the code and finds violations. Reporter takes those findings and acts on them: posting reports and creating issues.
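To illustrate the Reporter half of that chain, here is a minimal Python sketch of turning scanner findings into a grouped report and picking out the ones that get tracking issues. The dictionary keys, the report layout, and the function names are assumptions made for this example; the real Reporter is an agent driven by its prompt, not this code.

```python
from collections import defaultdict

SEVERITY_ORDER = ["Critical", "High", "Medium"]
ISSUE_SEVERITIES = {"Critical", "High"}  # only these get tracking issues


def render_report(findings):
    """Group findings by severity and render a plain-text MR comment."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding["severity"]].append(finding)

    lines = [f"HealthGuard compliance report: {len(findings)} finding(s)"]
    for severity in SEVERITY_ORDER:
        items = grouped.get(severity, [])
        if not items:
            continue
        lines.append(f"{severity} ({len(items)})")
        for item in items:
            lines.append(
                f"  {item['file']}:{item['line']} "
                f"{item['summary']} [{item['citation']}]"
            )
    return "\n".join(lines)


def findings_needing_issues(findings):
    """Return the Critical and High findings that should become issues."""
    return [f for f in findings if f["severity"] in ISSUE_SEVERITIES]
```

Separating "find" from "report and act" keeps each step simple: the Scanner only has to produce structured findings, and the Reporter never has to read code.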

I also set up an external Anthropic Claude agent config at .gitlab/duo/flows/claude.yaml. Setting injectGatewayToken: true means it uses GitLab-managed credentials through the AI Gateway, which routes through Google Cloud Vertex AI. No API keys to manage, no secrets to rotate.

The CI/CD compliance gate in .gitlab-ci.yml runs as a pipeline stage. It uses grep patterns to scan for common healthcare violations before the AI agent processes anything. This is the Green Agent angle: catch the obvious stuff cheaply, save the expensive AI compute for what actually needs it.

For the demo, I wrote realistic healthcare code with real violations baked in:

  • A Java PatientController that exposes FHIR endpoints without auth
  • A LabResultService that logs patient SSNs to the console
  • A Python API with hardcoded OpenMRS credentials and no consent checks for cross-border transfers
  • A React PatientDashboard that puts MRNs in URL params
  • A Docker Compose with unencrypted database connections

All of this is stuff I've actually seen in real codebases over the years.

I published everything to the AI Catalog with a git tag. GitLab handles the flow triggers automatically.

Challenges we ran into

First wall: I spent time thinking I needed Maintainer access to set up flow triggers. The docs literally say "at least the Maintainer role" for creating triggers. Turns out GitLab handles trigger creation automatically when you publish a flow through a tag. Nobody tells you that until you ask on Discord.

Second wall: the hackathon group has a pipeline execution policy that overrides custom .gitlab-ci.yml files. So my CI/CD compliance gate is in the repo and works, but the hackathon's built-in pipeline runs instead of mine. Not ideal for the demo, but the code is right there for judges to review.

Third wall: flow sessions were failing even though jobs passed. WebSocket would connect, then close immediately with no error message. Other participants hit the exact same issue. I escalated to Lee Tickett on Discord and he's been helping us get unblocked.

The real work was the system prompt. My first version flagged everything and anything. Second version was too conservative and missed obvious violations. Getting the balance right took several rounds of testing against the demo files. The agent now finds 19 distinct violations with correct regulatory citations. That didn't happen on the first try.

Accomplishments that we're proud of

19 violations found on first real scan. Not 19 vague warnings. 19 specific findings with file names, line numbers, severity levels, regulatory citations, and working fix code for each one.

I built a custom agent AND an orchestrated multi-step flow. I looked at GitLab's Prompt Library and only 1 out of 111 prompts is "Advanced" orchestration level. HealthGuard AI plays in that space.

The agent catches things generic tools miss entirely:

  • It knows that logger.info("Processing patient: " + patient.getName()) is a PHI exposure
  • It knows that http://fhir-server:8080/Patient/$everything without OAuth is a critical auth gap
  • It knows that sending lab results to partner-lab.co.ke without checking Kenya DPA §48 consent is a cross-border violation

That's not just prompt engineering. That's domain knowledge from deploying these systems across 50+ counties.

Multi-language out of the box. Same agent scans Java, Python, TypeScript, Docker Compose, and YAML configs. No per-language configuration needed.

The CI/CD gate catches ~40% of violations before the AI agent even runs, saving compute minutes and model tokens on every scan. Green agent thinking built in from day one.

What we learned

System prompts are the product. The YAML config is maybe 50 lines total. The system prompt is where the real intelligence lives. Healthcare compliance has enough nuance that you can't just tell an agent "check for security issues." You have to teach it what an MRN looks like, why FHIR Bundle scoping matters, what the difference is between HIPAA §164.312(a) access controls and §164.312(e) transmission security.

GitLab's Agent Platform is more powerful than it looks at first glance. The flow system with automatic triggers, service accounts, and the scanner-reporter chain pattern is solid once you understand how the pieces connect. And the fact that it routes through Google Cloud Vertex AI means you get enterprise-grade infrastructure without setting up anything yourself.

The injectGatewayToken pattern for external agents is clean. One line in YAML and Anthropic Claude is integrated through GitLab-managed credentials. No API key management, no rotation headaches. That's how integrations should work.

Discord is essential for hackathons. When I hit the role access wall, Lee Tickett responded in minutes with the answer. Don't sit stuck for hours when someone in the community can unblock you in minutes.

And the biggest takeaway: healthcare compliance is genuinely underserved in DevSecOps. The gap between what tools like SAST and Semgrep catch versus what healthcare teams actually need is enormous. This is a real problem that's been waiting for a solution.

What's next for HealthGuard AI

Next up: more detection rules for ICD-10 diagnosis codes, medication data (NCPDP standards), and medical device identifiers. More regulatory frameworks like PIPEDA for Canada and LGPD for Brazil.

I want to add a compliance dashboard on GitLab Pages showing scores trending over time per project. And deeper Google Cloud integration using Vertex AI for specialized medical NLP models that can better identify clinical terminology in code comments and string literals.

The long term goal is to publish this to the broader AI Catalog so any healthcare team on GitLab can enable it with one click. There are thousands of health tech teams pushing code every day with zero automated compliance checks. That needs to change.

Updates

HealthGuard AI is live and submitted for the GitLab AI Hackathon.

28 violations caught on first real scan across Java, Python, TypeScript, and Docker — 15 Critical, 8 High, 5 Medium. Compliance score: 0/100. Merge blocked. 17 tracking issues auto-created with HIPAA, GDPR, and Kenya DPA labels.

What makes it different from generic SAST tools: it knows that logger.info("Processing patient: " + patient.getName()) is a PHI exposure. It knows that a FHIR $everything endpoint without OAuth scope check breaks HIPAA §164.312(a)(2)(i). It knows that sending lab results to a partner lab without checking Kenya DPA §48 is a cross-border consent violation. That knowledge comes from actually deploying these systems across 50+ Kenyan counties — not from skimming a compliance PDF.

Architecture: custom Scanner agent feeds into a Reporter agent via orchestrated multi-step flow. Reporter posts the full findings report on the MR and creates labeled tracking issues automatically. Anthropic Claude handles deep whole-file analysis via GitLab AI Gateway with injectGatewayToken. A CI/CD compliance gate runs grep-based pre-screening before any AI model touches the code — catching ~40% of violations cheaply and saving compute on every scan.

Next: ICD-10 and medication data detection rules, PIPEDA and LGPD coverage, and a compliance score dashboard on GitLab Pages.

Healthcare teams are pushing code every day with zero automated compliance checks. That needs to change.
