SecureFlow AI

Inspiration

Security is one of the biggest bottlenecks in software delivery. Teams discover vulnerabilities too late, secrets leak into repositories, and remediation guidance is scattered across tools and comments. Manual security reviews are inconsistent, slow, and don't scale.

"The cost of fixing a vulnerability grows exponentially the later it is found in the SDLC."

I wanted to build an AI teammate that fits directly into GitLab workflows — one that doesn't just report problems, but actually fixes them.


What it does

SecureFlow AI is a multi-agent security orchestration platform built on the GitLab Duo Agent Platform. Triggered by a single issue comment, it:

  • 🔍 Scans repositories for vulnerabilities, exposed secrets, dependency risks, and IaC/CI/CD misconfigurations across $15+$ languages and ecosystems.
  • ⚖️ Triages findings with policy-based gating (strict, balanced, lenient) and suppresses accepted-risk fingerprints from a baseline file.
  • 🔧 Generates patch-ready autofix candidates for the highest-impact findings.
  • 🚀 Auto-remediates by applying fixes, committing code, and opening draft merge requests — no human coding required.
  • 📊 Produces a structured compliance report with trend analysis (compared against previous scans), auto-labels the issue with severity tags, and creates tracked vulnerability issues for critical findings.
  • 🌱 Tracks green metrics (files scanned, files skipped, tool calls, scan strategy) for resource-efficient AI usage.

Confidence & Exploitability Scoring

Every finding is scored with:

$$\text{confidence} \in [0.00, 1.00], \quad \text{exploitability} \in \{1, 2, 3, 4, 5\}$$

Policy gate thresholds vary by mode:

| Mode | Fail Condition |
|---|---|
| strict | Any HIGH/CRITICAL with $\text{confidence} \geq 0.70$ |
| balanced | Any CRITICAL or $\geq 2$ HIGH with $\text{confidence} \geq 0.80$ |
| lenient | CRITICAL/HIGH with $\text{confidence} \geq 0.90$ and $\text{exploitability} \geq 4$ |
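
The gate logic above can be sketched in a few lines of Python. This is an illustration only, not the project's actual implementation: the `Finding` class and `gate_fails` function are hypothetical names, and the sketch assumes the balanced-mode confidence threshold applies to both CRITICAL and HIGH findings, which the table leaves open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str        # e.g. "CRITICAL", "HIGH", "MEDIUM", "LOW"
    confidence: float    # 0.00 to 1.00
    exploitability: int  # 1 to 5


def gate_fails(findings: list[Finding], mode: str) -> bool:
    """Return True when the policy gate should fail for the given mode."""
    serious = {"HIGH", "CRITICAL"}
    if mode == "strict":
        return any(f.severity in serious and f.confidence >= 0.70 for f in findings)
    if mode == "balanced":
        # Assumption: the 0.80 threshold covers CRITICAL and HIGH alike.
        confident = [f for f in findings if f.confidence >= 0.80]
        criticals = sum(f.severity == "CRITICAL" for f in confident)
        highs = sum(f.severity == "HIGH" for f in confident)
        return criticals >= 1 or highs >= 2
    if mode == "lenient":
        return any(
            f.severity in serious and f.confidence >= 0.90 and f.exploitability >= 4
            for f in findings
        )
    raise ValueError(f"unknown policy mode: {mode}")
```

Under these assumptions, a single HIGH finding at confidence 0.85 fails the strict gate but passes the balanced one, while two such findings fail both.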

How we built it

SecureFlow AI is built using:

  • A custom agent.yml defining core security behavior, scoring logic, IaC-specific checklists, and green metrics tracking.
  • A custom flow.yml with a 5-agent orchestrated pipeline:
security_scanner → secret_detector → fix_generator → auto_remediator → compliance_reporter

| # | Agent | Role | Key Tools |
|---|---|---|---|
| 1 | security_scanner | Deterministic grep + AI vulnerability detection + vuln DB cross-ref | read_file, grep, list_vulnerabilities, list_security_findings |
| 2 | secret_detector | Risk triage, deduplication, baseline suppression, policy gate | read_file, grep, list_vulnerabilities |
| 3 | fix_generator | Patch-ready autofix candidates with MR metadata | read_file, read_files |
| 4 | auto_remediator | Applies fixes, commits code, opens draft MRs | edit_file, create_commit, create_merge_request |
| 5 | compliance_reporter | Trend analysis, structured report, auto-labeling, vuln issues | list_issue_notes, create_issue_note, update_issue |
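
The hand-off between the five stages can be pictured with a small Python sketch. It is not how the GitLab Duo Agent Platform executes a flow. It only illustrates a sequential chain in which each stage's JSON output is checked before the next stage runs. The `REQUIRED_KEYS` contract, `parse_stage_output`, and `run_pipeline` are hypothetical names chosen for illustration.

```python
import json
from typing import Callable

STAGES = [
    "security_scanner",
    "secret_detector",
    "fix_generator",
    "auto_remediator",
    "compliance_reporter",
]

# Hypothetical contract: keys every stage output is assumed to carry.
REQUIRED_KEYS = {"stage", "findings"}


def parse_stage_output(raw: str, stage: str) -> dict:
    """Parse one stage's raw output and reject anything that breaks the contract."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"{stage}: output is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError(f"{stage}: output must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"{stage}: missing keys {sorted(missing)}")
    return data


def run_pipeline(agents: dict[str, Callable[[dict], str]], context: dict) -> dict:
    """Run the stages in order, passing each validated output to the next stage."""
    for stage in STAGES:
        raw = agents[stage](context)
        context = parse_stage_output(raw, stage)
    return context
```

Failing fast with the stage name in the error makes it clear which agent produced the malformed output, rather than letting a later stage fail on bad input.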

Total platform tools used: $15+$ (vs. typical submissions using $2$–$5$)

All five agents run on Anthropic's Claude models through the GitLab Duo Agent Platform, leveraging Claude's structured output and function-calling capabilities.

Supporting Files

| File | Purpose |
|---|---|
| AGENTS.md | Industry-standard agent behavior configuration |
| .secureflow-baseline.json | Accepted-risk suppression across repeated scans |
| .gitlab/issue_templates/security_scan.md | Pre-filled trigger template for easy onboarding |
| CONTRIBUTING.md | Contribution guidelines for the project |
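
To make the baseline idea concrete, here is a minimal Python sketch of fingerprint-based suppression. The real schema of `.secureflow-baseline.json` is not shown in this write-up, so the `accepted_fingerprints` key, the fingerprint recipe (SHA-256 over rule ID, path, and whitespace-normalized snippet), and both function names are assumptions made for illustration.

```python
import hashlib
import json


def fingerprint(rule_id: str, path: str, snippet: str) -> str:
    """Hypothetical fingerprint: stable across whitespace-only changes."""
    normalized = " ".join(snippet.split())
    payload = "\x1f".join([rule_id, path, normalized])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_baseline(findings: list[dict], baseline_json: str) -> list[dict]:
    """Drop findings whose fingerprint is listed as accepted risk."""
    # Assumed baseline shape: {"accepted_fingerprints": ["<sha256>", ...]}
    accepted = set(json.loads(baseline_json).get("accepted_fingerprints", []))
    return [
        f
        for f in findings
        if fingerprint(f["rule_id"], f["path"], f["snippet"]) not in accepted
    ]
```

Because the snippet is normalized before hashing, reformatting a line does not resurrect a finding that was already accepted, while a change to the rule, file path, or code content produces a new fingerprint.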

Challenges we ran into

  • Catalog schema validation required exact tool name matching against GitLab's strict enum allowlist — tool names like write_file needed to be edit_file, and create_commits needed to be create_commit.
  • Service-account and flow-handle sync issues during tag updates and catalog publishing.
  • Payload size management between flow stages — balancing thorough detection with runtime limits.
  • Prompt contract reliability — designing structured JSON outputs across all five agents without breaking the chain.
  • Noise reduction — solved with deterministic pre-checks (grep) before LLM reasoning, capping findings at $n = 20$.
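
A pre-publish check for the tool-name problem could look like the Python sketch below. The allowlist here contains only the tool names mentioned in this write-up and stands in for GitLab's real enum, which is larger. The `unknown_tools` function and the shape of the `flow_tools` mapping are hypothetical.

```python
# Stand-in allowlist: only the tool names named in this write-up,
# not GitLab's full enum.
ALLOWED_TOOLS = frozenset(
    {
        "read_file",
        "read_files",
        "grep",
        "list_vulnerabilities",
        "list_security_findings",
        "edit_file",
        "create_commit",
        "create_merge_request",
        "list_issue_notes",
        "create_issue_note",
        "update_issue",
    }
)


def unknown_tools(flow_tools: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each agent to the tool names that are not in the allowlist."""
    problems: dict[str, list[str]] = {}
    for agent, tools in flow_tools.items():
        bad = [name for name in tools if name not in ALLOWED_TOOLS]
        if bad:
            problems[agent] = bad
    return problems
```

Running a check like this before publishing would have flagged `write_file` and `create_commits` locally instead of at catalog validation time.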

Accomplishments that we're proud of

  • ✅ Built a working 5-agent auto-remediation pipeline: scan → triage → fix → commit → merge request in a single trigger.
  • ✅ Implemented real draft merge request creation — the agent writes the code and opens the MR.
  • ✅ Added trend analysis by reading previous report comments and computing severity deltas: $\Delta = \text{current} - \text{previous}$.
  • ✅ Implemented auto-labeling of issues (secureflow::pass/warn/fail, severity::critical/high).
  • ✅ Added green metrics tracking for resource-efficient AI usage.
  • ✅ Expanded detection across $15+$ languages, dependency manifests, CI/CD, Docker, Terraform, and Kubernetes.
  • ✅ Used $15+$ GitLab Duo platform tools — deep integration, not surface-level.
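
The severity delta used for trend analysis, $\Delta = \text{current} - \text{previous}$, can be sketched per severity level in Python. The function name and the `medium` and `low` levels are assumptions for illustration, and parsing the previous report comment is left out.

```python
from collections import Counter

# Assumed severity levels; the write-up names critical and high explicitly.
SEVERITIES = ("critical", "high", "medium", "low")


def severity_deltas(current: list[str], previous: list[str]) -> dict[str, int]:
    """Compute current minus previous finding counts for each severity."""
    cur, prev = Counter(current), Counter(previous)
    # Counter returns 0 for severities that do not appear in a scan.
    return {level: cur[level] - prev[level] for level in SEVERITIES}
```

A negative value means fewer findings at that severity than in the previous scan, and a positive value means a regression.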

What we learned

  • Prompt contracts and strict structured JSON outputs are critical for reliable multi-agent orchestration — one malformed output breaks the entire chain.
  • Deterministic checks (grep patterns) before LLM reasoning dramatically improve precision, consistency, and resource efficiency.
  • Good security automation needs triage and prioritization, not just raw findings — developers ignore noisy reports.
  • Developer experience matters: one-trigger workflows, clear reports, and auto-created MRs drive real adoption.
  • The GitLab Duo tool enum is strict — always validate tool names against the schema before publishing to the AI Catalog.
  • Green AI design (skipping binaries, capping findings, sequential pipelines) is both environmentally responsible and practically faster.
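
The deterministic pre-check idea can be illustrated with a short Python sketch that scans text with fixed patterns, skips binary-looking files, and stops at the cap of 20 findings. The two patterns, the suffix list, and the `prescan` function are hypothetical examples. The project's actual grep rules are not shown in this write-up.

```python
import re

MAX_FINDINGS = 20  # cap mentioned in the write-up

# Hypothetical patterns for illustration only.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
}

# Assumed list of suffixes treated as binary and skipped.
BINARY_SUFFIXES = (".png", ".jpg", ".zip", ".pdf", ".exe")


def prescan(files: dict[str, str], limit: int = MAX_FINDINGS) -> list[dict]:
    """Return up to `limit` pattern hits from a mapping of path to file text."""
    hits: list[dict] = []
    if limit <= 0:
        return hits
    for path, text in files.items():
        if path.lower().endswith(BINARY_SUFFIXES):
            continue  # skip binaries to save work
        for line_no, line in enumerate(text.splitlines(), start=1):
            for rule_id, pattern in PATTERNS.items():
                if pattern.search(line):
                    hits.append({"rule_id": rule_id, "path": path, "line": line_no})
                    if len(hits) >= limit:
                        return hits
    return hits
```

Only the capped hit list would then be passed to the LLM stage, which keeps the payload small and the results repeatable between runs.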

What's next for SecureFlow AI

  • 📋 Add configurable org-level policy profiles and per-project overrides for enterprise teams.
  • 🔄 Add richer suppression lifecycle (expiry dates, ownership assignment, review reminders).
  • 📈 Add severity drift tracking and risk burn-down charts across multiple scan runs.
  • 🔧 Expand auto-remediation to support multi-file fixes and dependency upgrade MRs.
  • 🏢 Evolve from hackathon prototype to a production-ready security teammate for GitLab teams.
