ARIA: Autonomous Incident Response Agent

Inspiration

Production incidents are expensive. Teams spend 30–120 minutes diagnosing failures while users see downtime. We built ARIA Gatekeeper to compress that window: pipeline fails → analysis → fix proposed in under 5 minutes, with zero manual log-reading and full compliance audit trails baked in.

What it does

ARIA Gatekeeper is a multi-agent flow that automatically:

Triages pipeline failures within 10 seconds
Analyzes repository + logs (read-only) within 90 seconds
Proposes fixes as draft MRs with full reasoning
Generates SOC2/ISO27001 compliance evidence automatically
Tracks environmental impact (CO₂ footprint of failed deploys)

All MRs require human review. No auto-merge. No protected branches are touched.

Proof: <45 seconds from webhook to draft MR. Tested end-to-end.

How I built it

Flow: YAML-based GitLab Agent Platform orchestration (Triage → Reader → Writer agents)
Safety: Separated read and write operations across agents; confidence thresholds (85%+ auto-draft, <85% escalate)
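
The confidence gate can be sketched roughly as follows. This is a minimal Python illustration, not the actual flow configuration: the `Diagnosis` type, the `route` function, and the `"draft_mr"` / `"escalate"` labels are hypothetical names, and only the 85% cutoff comes from the description above.

```python
from dataclasses import dataclass

# Taken from the "85%+ auto-draft, <85% escalate" rule; everything else
# in this sketch is an assumed shape for illustration.
AUTO_DRAFT_THRESHOLD = 0.85


@dataclass
class Diagnosis:
    error_type: str    # e.g. "missing_dependency" (hypothetical label)
    confidence: float  # agent-reported confidence in [0.0, 1.0]


def route(diagnosis: Diagnosis, threshold: float = AUTO_DRAFT_THRESHOLD) -> str:
    """Return "draft_mr" at or above the threshold, otherwise "escalate"."""
    if not 0.0 <= diagnosis.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    return "draft_mr" if diagnosis.confidence >= threshold else "escalate"


if __name__ == "__main__":
    print(route(Diagnosis("missing_dependency", 0.91)))
    print(route(Diagnosis("unknown", 0.60)))
```

Either outcome still ends in human review: a `"draft_mr"` result only opens a draft MR, and `"escalate"` hands the failure to a person without proposing a fix. Passing a different `threshold` per call is one way a per-error-type cutoff could be expressed.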

Challenges I ran into

No Maintainer access → Worked around with CI jobs that run on failure (when: on_failure) instead of native webhooks
Signal vs. noise in logs → Agents must quote specific error lines before proposing a diagnosis
When to auto-execute vs. escalate → Tuned confidence thresholds per error type
Webhook idempotency → Added composite keys (pipeline ID + commit SHA) to prevent duplicate MRs
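
The idempotency check from the last item can be sketched like this. It is a minimal Python illustration under stated assumptions: `dedup_key` and `DuplicateGuard` are hypothetical names, and the in-memory set stands in for whatever persistent store a real deployment would need so that keys survive restarts.

```python
import hashlib


def dedup_key(pipeline_id: int, commit_sha: str) -> str:
    """Build a composite key from pipeline ID + commit SHA."""
    # Lowercase the SHA so the same commit always maps to the same key.
    raw = f"{pipeline_id}:{commit_sha.lower()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class DuplicateGuard:
    """Remembers which (pipeline, commit) pairs were already handled."""

    def __init__(self) -> None:
        # Assumption: in-memory only; swap for a durable store in practice.
        self._seen: set[str] = set()

    def should_process(self, pipeline_id: int, commit_sha: str) -> bool:
        """Return True the first time a pair is seen, False on repeats."""
        key = dedup_key(pipeline_id, commit_sha)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    guard = DuplicateGuard()
    print(guard.should_process(101, "abc123"))  # first delivery
    print(guard.should_process(101, "abc123"))  # redelivery of the same event
```

Keying on both values means a retried delivery for the same pipeline and commit is dropped, while a new pipeline on the same commit (for example, a manual re-run) is still processed.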

Accomplishments that I'm proud of

✅ End-to-end automation: Webhook → Draft MR in <45 seconds
✅ Audit-ready by default: SOC2 trails generated automatically
✅ Multi-agent safety: Reader agents are read-only; writers execute only approved steps

What I learned

Agent specialization > monolithic agent — Three focused agents = fewer hallucinations
Confidence thresholds matter more than speed — Safe escalation > fast but risky auto-fixes
Untrusted input = agent sabotage — Strict validation prevents prompt injection from webhook payloads
Compliance is a feature — Teams care about audit trails as much as speed
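
As an illustration of the strict-validation point, here is a minimal Python sketch of allow-list validation applied before any payload field reaches an agent prompt. The flat payload shape and the field names (`pipeline_id`, `commit_sha`, `ref`, `status`) are assumptions made for this example, not the project's actual schema.

```python
import re

# Hypothetical flat payload shape, assumed for illustration only.
REQUIRED_KEYS = {"pipeline_id", "commit_sha", "ref", "status"}
SHA_RE = re.compile(r"[0-9a-f]{40}")            # full 40-character hex commit SHA
REF_RE = re.compile(r"[A-Za-z0-9._/-]{1,255}")  # conservative branch/tag charset


def validate_payload(payload: dict) -> dict:
    """Return a cleaned copy of the payload, or raise ValueError."""
    if set(payload) != REQUIRED_KEYS:
        raise ValueError("payload keys do not match the expected set")

    pipeline_id = payload["pipeline_id"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if (isinstance(pipeline_id, bool) or not isinstance(pipeline_id, int)
            or pipeline_id <= 0):
        raise ValueError("pipeline_id must be a positive integer")

    commit_sha = payload["commit_sha"]
    if not isinstance(commit_sha, str) or not SHA_RE.fullmatch(commit_sha):
        raise ValueError("commit_sha must be a 40-character lowercase hex string")

    ref = payload["ref"]
    # Whitespace and newlines are outside the allowed charset, so free-form
    # text smuggled into the ref field is rejected here.
    if not isinstance(ref, str) or not REF_RE.fullmatch(ref):
        raise ValueError("ref contains characters outside the allowed set")

    if payload["status"] != "failed":
        raise ValueError("only failed pipelines are handled")

    return {"pipeline_id": pipeline_id, "commit_sha": commit_sha,
            "ref": ref, "status": "failed"}
```

Rejecting anything outside a narrow, typed schema keeps instruction-like text out of structured fields. Log contents are a separate problem, since they are free-form by nature and cannot be validated this way.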

What's next

Cross-project correlation: Link failures across repos automatically
Feedback loop: Track accepted vs. rejected recommendations; auto-improve prompts

Built With

gitlab-agent-platform
gitlab-ci/cd
yaml

Updates

Mansi Rank started this project — Mar 25, 2026 01:24 PM EDT
