Inspiration

I've been building a crypto trading agent for a while now — multi-model AI brain, risk management, the whole thing. every time I pushed code I'd catch myself manually scanning for leaked API keys, PII in logs, stuff that could blow up in prod. it's the kind of thing you always mean to do but nobody has time for on every single MR.

so I thought: what if every merge request got a full security review automatically? not just linting, but actual threat modeling, GDPR compliance, secret detection — and it doesn't just complain, it opens fix MRs for you.

What it does

Aegis is a multi-agent GitLab Flow that reviews every merge request for security and privacy issues. it uses four specialized agents: three scanners that run in parallel, plus a synthesizer that combines their findings:

  1. Privacy Agent — catches PII in logs, API responses, and URLs. checks GDPR compliance (Articles 5, 6, 7, 25, 32)

  2. Threat Model Agent — runs STRIDE analysis on code changes, flags SQL injection, unauthed endpoints, file upload risks

  3. Secret Sentinel — detects leaked credentials via pattern matching + Shannon entropy analysis, then auto-generates a fix MR replacing secrets with env vars + rotation instructions

  4. Report Synthesizer — deduplicates and combines everything into one structured MR comment with severity scoring

you add Aegis to your project, push an MR, and get a full security review posted as a comment.

19 issues caught on our demo MR — 3 Critical, 15 High, 1 Medium. zero config needed.
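
to make the entropy part of the Secret Sentinel concrete, here's a minimal sketch of a Shannon entropy check. the threshold, the token pattern, and the function names are assumptions picked for illustration, not the values Aegis ships with.

```python
import math
import re
from collections import Counter

# hypothetical threshold (bits per character) and token pattern, for illustration only
ENTROPY_THRESHOLD = 4.0
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Bits per character: H = -sum(p * log2(p)) over character frequencies."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def high_entropy_tokens(line: str) -> list[str]:
    """Return long tokens whose character distribution looks random."""
    return [t for t in TOKEN_RE.findall(line) if shannon_entropy(t) >= ENTROPY_THRESHOLD]
```

entropy alone is noisy (hashes and UUIDs also score high), which is why it's paired with pattern matching in the description above.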

How we built it

designed the four-agent architecture and built it in a single session with Claude. bottom-up approach:

  • Python analysis toolkit first — AST-based PII scanner, GDPR rule engine, STRIDE threat analyzer, secret detector with Shannon entropy, auto-fixers that generate clean replacement code

  • CLI layer — so everything runs locally too (aegis scan-all --code app.py)

  • GitLab Duo agent configs — each Python tool wired into an external agent with Claude as the reasoning layer for contextual analysis and false positive reduction

  • Flow orchestration — privacy + threat + secrets scan in parallel, synthesizer waits for all three, posts the combined report as an MR comment

  • 56 tests covering detection, fixers, patterns, and report formatting
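
the parallel-then-synthesize shape of the flow can be approximated locally with a thread pool. this is only a sketch: the real orchestration runs as a GitLab flow, and the three scanner stubs, `synthesize`, and `review` below are hypothetical stand-ins that just do substring matching.

```python
from concurrent.futures import ThreadPoolExecutor

SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2}


def _match(diff, needle, severity, msg):
    # toy detector: report every line of the diff that contains `needle`
    return [
        {"severity": severity, "line": n, "msg": msg}
        for n, text in enumerate(diff.splitlines(), 1)
        if needle in text
    ]


# hypothetical stand-ins for the three scanning agents
def privacy_scan(diff):
    return _match(diff, "user.email", "High", "PII in log")


def threat_scan(diff):
    return _match(diff, 'f"SELECT', "High", "possible SQL injection")


def secret_scan(diff):
    return _match(diff, "API_KEY =", "Critical", "hardcoded secret")


def synthesize(results):
    """Deduplicate on (line, msg), then sort by severity and line number."""
    seen, merged = set(), []
    for findings in results:
        for f in findings:
            key = (f["line"], f["msg"])
            if key not in seen:
                seen.add(key)
                merged.append(f)
    return sorted(merged, key=lambda f: (SEVERITY_ORDER[f["severity"]], f["line"]))


def review(diff):
    scanners = [privacy_scan, threat_scan, secret_scan]
    with ThreadPoolExecutor(max_workers=len(scanners)) as pool:
        # pool.map only returns once all three scanners have finished
        results = list(pool.map(lambda scan: scan(diff), scanners))
    return synthesize(results)
```

the synthesizer only ever sees finished results, so it doesn't need to know how the scanners were scheduled.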

the whole toolkit is pure stdlib (ast, re) plus unidiff for diff parsing. no paid external APIs; Claude access comes through GitLab's AI Gateway.
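
as a rough idea of what an AST-based PII check looks like with nothing but `ast`, the sketch below walks the tree, finds logging-style calls, and reports arguments whose names look like PII. the name list, the method list, and `find_pii_in_logs` are assumptions for illustration; the real scanner's rules aren't shown here.

```python
import ast

# hypothetical name lists, for illustration only
PII_NAMES = {"email", "ssn", "phone", "password"}
LOG_METHODS = {"debug", "info", "warning", "error", "critical", "exception"}


def find_pii_in_logs(source: str) -> list[tuple[int, str]]:
    """Return (line, name) pairs where a PII-looking name is passed to a log call."""
    findings = set()
    for node in ast.walk(ast.parse(source)):
        # note: matches any `<obj>.info(...)`-style call, not only real loggers
        if not (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in LOG_METHODS
        ):
            continue
        for arg in node.args:
            for sub in ast.walk(arg):
                if isinstance(sub, ast.Attribute):
                    name = sub.attr
                elif isinstance(sub, ast.Name):
                    name = sub.id
                else:
                    continue
                if name in PII_NAMES:
                    findings.add((node.lineno, name))
    return sorted(findings)
```

working on the AST rather than raw text means `print(user.email)` and a comment mentioning `email` don't get confused with an actual log call.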

Challenges we ran into

false positive tuning — the PII scanner kept flagging test fixtures and example emails as real leaks. had to build contextual heuristics so it understands the difference between test@example.com in a test file and an actual email being logged in production code.
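
one way to express that kind of contextual heuristic, assuming the rule is "suppress an email only when it uses a reserved example domain and sits in a test file". the path conventions and function names here are hypothetical, and the actual heuristics in Aegis may weigh more signals.

```python
import re
from pathlib import PurePosixPath

EMAIL_RE = re.compile(r"[\w.+-]+@([\w-]+(?:\.[\w-]+)+)")
# domains reserved for documentation and examples (RFC 2606)
EXAMPLE_DOMAINS = {"example.com", "example.org", "example.net"}


def is_test_path(path: str) -> bool:
    # assumed conventions: a tests/ directory, test_*.py, or *_test.py
    parts = PurePosixPath(path).parts
    name = parts[-1] if parts else ""
    return "tests" in parts or name.startswith("test_") or name.endswith("_test.py")


def should_flag_email(path: str, line: str) -> bool:
    """Flag an email unless it is a reserved example address inside a test file."""
    for m in EMAIL_RE.finditer(line):
        domain = m.group(1).lower()
        if not (is_test_path(path) and domain in EXAMPLE_DOMAINS):
            return True
    return False
```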

auto-fix generation — replacing a hardcoded secret sounds simple until you account for different assignment styles (string literals, dict values, f-strings, multi-line). the fixer needs to preserve indentation, generate the right env var name, and update .env.example — all without breaking the code.
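
for the simplest of those styles (a single-line `name = "literal"` assignment), a fixer can be sketched like this. the regex, the env var naming rule, and `fix_line` are assumptions for illustration; dict values, f-strings, and multi-line strings would each need their own handling.

```python
import re

# matches only the simplest style: NAME = "literal" on one line
ASSIGN_RE = re.compile(
    r"""^(?P<indent>\s*)(?P<name>[A-Za-z_]\w*)\s*=\s*(?P<quote>["'])[^"']*(?P=quote)\s*$"""
)


def env_var_name(identifier: str) -> str:
    """api_key -> API_KEY, stripeSecret -> STRIPE_SECRET (hypothetical naming rule)."""
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", identifier)
    return snake.upper()


def fix_line(line: str):
    """Return (fixed_line, env_example_entry), or None if the line is not a simple assignment."""
    m = ASSIGN_RE.match(line)
    if m is None:
        return None
    env = env_var_name(m["name"])
    # keep the original indentation; the caller must make sure `import os` exists
    fixed = f'{m["indent"]}{m["name"]} = os.environ["{env}"]'
    return fixed, f"{env}="
```

returning `None` for anything it doesn't recognize keeps the fixer from rewriting code it only half understands.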

STRIDE mapping — translating raw code patterns into meaningful threat categories required careful thought about what's actually dangerous vs. normal code. a DELETE endpoint isn't always a risk — but a DELETE endpoint without auth checks is.

Accomplishments that we're proud of

  • 19 real findings on the first scan of our demo MR — 3 Critical secrets, SQL injection, unauthed endpoints, PII in logs, GDPR violations. it catches the stuff that actually gets people breached

  • 56 tests passing in 0.06s — fast, reliable, no flaky tests

  • the report format — structured markdown with severity scoring, specific line numbers, rotation instructions for each secret type, and GDPR article references. it's actionable, not just a wall of warnings

  • zero external dependencies for the core scanner — pure Python stdlib. installs in seconds, runs anywhere

What we learned

the multi-agent architecture — even running locally — proved that parallel scanning followed by synthesis is the right model for security review. each scanner is focused and fast on its own, but the combined report is where the real value is. designing for the Duo Agent Platform flow model forced us to think about agent boundaries and data handoff cleanly, which made the whole system better even outside of GitLab.

What's next for Aegis

  • more detection patterns — OWASP top 10 coverage, more secret types (cloud provider keys, SSH keys), more GDPR articles

  • CI/CD pipeline integration — block merges on Critical findings, require sign-off on High

  • custom rule definitions — let teams add their own compliance rules (HIPAA, SOC2, PCI-DSS)

  • learning from feedback — track which findings get dismissed vs. acted on, reduce false positives over time

Built With

  • ast
  • claude
  • claude-code
  • gitlab
  • python
  • regex
  • unidiff