Inspiration
Production incidents are stressful. When a system breaks at 2 AM, SREs and developers spend hours manually tracing Git commit history, recent merge requests, pipeline runs, and ownership charts to figure out what changed, who approved it, and what else was affected. Often, incident postmortems are delayed, incomplete, or skipped entirely because of this manual overhead.
We wanted to build an automated, zero-friction incident autopsy assistant that provides instant, rich context at the exact moment an incident is reported.
What it does
OrbitPostMortem is a custom GitLab Duo Flow and Skill. When an engineer mentions @ai-orbitpostmortem-incident-autopsy-gitlab-ai-hackathon in a GitLab issue that reports an incident, the flow automatically runs four steps:
- Root Cause Analysis: Traverses the GitLab Orbit graph to find all MRs merged in the last 48 hours.
- Blast Radius Calculation: Computes the change blast radius, mapping altered files, triggered pipelines, and downstream modules.
- Owner Identification: Traces who authored and reviewed the suspected MRs, as well as who reported the incident.
- Automated Autopsy Output: Synthesizes this information into a formatted Markdown postmortem comment and posts it directly on the issue in seconds.
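The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the flow's actual implementation: the `MergedMR` record, its field names, and the helper functions are hypothetical, and the MR data is assumed to have already been fetched from the graph.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class MergedMR:
    """Hypothetical record for a merged MR; field names are illustrative."""
    iid: int
    title: str
    author: str
    reviewers: list[str]
    merged_at: datetime
    changed_files: list[str] = field(default_factory=list)


def suspects(mrs, incident_time, window_hours=48):
    """Return MRs merged within the window before the incident, newest first."""
    cutoff = incident_time - timedelta(hours=window_hours)
    recent = [mr for mr in mrs if cutoff <= mr.merged_at <= incident_time]
    return sorted(recent, key=lambda mr: mr.merged_at, reverse=True)


def blast_radius(mrs):
    """Union of files touched by the suspected MRs, sorted for stable output."""
    return sorted({path for mr in mrs for path in mr.changed_files})


def render_comment(mrs, files):
    """Format suspects and touched files as a Markdown comment body."""
    lines = ["## Incident autopsy", "", "### Suspected merge requests"]
    for mr in mrs:
        reviewers = ", ".join(f"@{r}" for r in mr.reviewers) or "none"
        lines.append(
            f"- !{mr.iid} {mr.title} (author @{mr.author}, reviewers {reviewers})"
        )
    lines += ["", "### Blast radius (files)"]
    lines += [f"- `{path}`" for path in files]
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample data only; a real run would use MRs returned by the graph query.
    incident = datetime(2024, 1, 3, 2, 0, tzinfo=timezone.utc)
    sample = [
        MergedMR(2, "Tune cache", "cy", ["di"], incident - timedelta(hours=5),
                 ["cache.py", "app.py"]),
        MergedMR(1, "Old change", "ana", ["bo"], incident - timedelta(hours=72),
                 ["a.py"]),
    ]
    found = suspects(sample, incident)
    print(render_comment(found, blast_radius(found)))
```

The sketch treats blast radius as the set of changed files only; the triggered pipelines and downstream modules mentioned above would need further graph lookups.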
How we built it
- GitLab Duo Flow Platform: We configured a custom `ambient` Flow YAML specifying an SRE Analyst component grounded with context inputs (`context:goal` and `context:project_id`).
- GitLab Orbit Skill (`SKILL.md`): We developed query recipes for the Orbit Knowledge Graph using its graph traversal APIs (`query_type: traversal`, `query_type: neighbors`) to map related nodes (Projects, MergeRequests, Users, WorkItems) and relationships (`IN_PROJECT`, `AUTHORED`).
- GitLab CLI (`glab`): Used locally to develop, configure, test, and register the skill definitions.
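To illustrate how neighbor and traversal queries differ, here is a toy in-memory property graph. It is not the Orbit API: the `ToyGraph` class, its methods, the node ID format, and the edge directions (MergeRequest to Project for `IN_PROJECT`, User to MergeRequest for `AUTHORED`) are all assumptions made for this sketch.

```python
from collections import defaultdict


class ToyGraph:
    """In-memory stand-in for a property graph; not the Orbit API."""

    def __init__(self):
        self._out = defaultdict(list)  # node -> [(relationship, target)]
        self._in = defaultdict(list)   # node -> [(relationship, source)]

    def add_edge(self, source, relationship, target):
        self._out[source].append((relationship, target))
        self._in[target].append((relationship, source))

    def neighbors(self, node, relationship, direction="out"):
        """Nodes one hop from `node` along `relationship`."""
        edges = self._out[node] if direction == "out" else self._in[node]
        return [other for rel, other in edges if rel == relationship]

    def traverse(self, start, path):
        """Follow a list of (relationship, direction) hops from `start`."""
        frontier = [start]
        for relationship, direction in path:
            next_frontier = []
            for node in frontier:
                for other in self.neighbors(node, relationship, direction):
                    if other not in next_frontier:
                        next_frontier.append(other)
            frontier = next_frontier
        return frontier


if __name__ == "__main__":
    g = ToyGraph()
    g.add_edge("mr:12", "IN_PROJECT", "project:shop")
    g.add_edge("mr:15", "IN_PROJECT", "project:shop")
    g.add_edge("user:ana", "AUTHORED", "mr:12")
    g.add_edge("user:bo", "AUTHORED", "mr:15")
    # One hop: which MRs belong to the project?
    print(g.neighbors("project:shop", "IN_PROJECT", direction="in"))
    # Two hops: who authored MRs in the project?
    print(g.traverse("project:shop", [("IN_PROJECT", "in"), ("AUTHORED", "in")]))
```

A neighbors query answers a single-hop question, while a traversal chains hops, which is what lets one query go from a project to its MRs to their authors.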
Challenges we ran into
Matching the flow's trigger to the correct auto-generated service account handle was tricky at first. Querying the project's direct members through the GitLab API let us find the exact system bot handle. Composing correct Orbit graph traversal queries required close reference to the schema, but once the queries were right, the results were precise.
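The member lookup can be sketched as follows. This assumes the member list has already been fetched as JSON from GitLab's "list all project members" REST endpoint (`GET /projects/:id/members/all`); the helper names and the substring match on `username` are hypothetical choices for this example, not something the flow itself does.

```python
from urllib.parse import quote


def members_url(base_url, project):
    """Build the URL for GitLab's 'list all project members' REST endpoint."""
    # `project` may be a numeric ID or a path such as "group/project",
    # which the API expects URL-encoded.
    encoded = quote(str(project), safe="")
    return f"{base_url.rstrip('/')}/api/v4/projects/{encoded}/members/all"


def candidate_handles(members, needle):
    """Return @handles of members whose username contains `needle`."""
    needle = needle.lower()
    return [
        "@" + m["username"]
        for m in members
        if needle in m.get("username", "").lower()
    ]


if __name__ == "__main__":
    # Sample response data; a real call would need an access token.
    sample = [
        {"username": "ana"},
        {"username": "ai-orbitpostmortem-incident-autopsy-gitlab-ai-hackathon"},
    ]
    print(members_url("https://gitlab.example.com", "group/project"))
    print(candidate_handles(sample, "orbitpostmortem"))
```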
Accomplishments that we're proud of
We successfully constructed a fully working custom flow that triggers instantly on mentions. In our tests, it generated a comprehensive postmortem report containing incident timelines, suspected MRs, blast radius files, and action items in under 60 seconds, a task that typically takes an SRE 45 minutes of manual correlation.
What we learned
We learned how powerful GitLab Orbit's context graph is. Grounding LLM agents in a live property graph of the entire software delivery lifecycle removes the need for brute-force database searches and reduces the risk of agent hallucinations.
What's next for OrbitPostMortem
- Incident Tool Integration: Directly integrate with tools like PagerDuty or Slack to auto-create GitLab issues and trigger the autopsy immediately.
- Rollback Automation: Give the agent permission to propose rollback MRs when a suspected MR is identified with high confidence.