Inspiration
Developer Experience (DX) is often discussed but rarely measured. Teams know when things "feel slow" (CI takes too long, reviews pile up, context switching kills focus), but they lack a systematic way to quantify these pain points and prioritize fixes. Meanwhile, DORA metrics sit in dashboards that few people check, and burnout creeps in without anyone noticing the patterns.
I was inspired by three ideas: (1) the DX framework from Abi Noda and Margaret-Anne Storey that breaks developer productivity into Feedback Loops, Flow State, and Cognitive Load; (2) the DORA research program that benchmarks delivery performance; and (3) the growing need for "Green AI" - understanding the environmental cost of AI tool usage. I wanted to bring all three together inside GitLab itself, where the data already lives, so teams can get actionable insights without leaving their workflow.
What it does
DX Insights Agent is a multi-agent system built on the GitLab Duo Agent Platform that analyzes Developer Experience across an entire GitLab group. Given a group path, it:
- Collects metrics across 30/60/90-day windows from MRs, pipelines, commits, issues, and security findings using 25 GitLab API tools
- Classifies projects as Software Engineering or AI Engineering and tailors analysis accordingly
- Scores three DX dimensions: Feedback Loops (CI speed, review latency, DORA benchmarking), Flow State (context switching, deep work patterns, per-developer burnout risk), and Cognitive Load (MR complexity, tech sprawl, onboarding difficulty)
- Builds per-developer profiles with flow scores, cognitive load levels, and burnout risk indicators
- Computes historical trends (IMPROVING / STABLE / DECLINING) across all time windows
- Generates a Green AI efficiency score for teams using AI agents and flows
- Commits a structured JSON report to a target repository and opens a Merge Request - making reports reviewable, versioned, and pipeline-ready
- Outputs a visual scorecard with scores, bar charts, developer health, prioritized pitfalls, and a 30-day outlook with and without intervention
A lightweight DX Quick Scan agent provides a rapid single-project health check and returns the top 3 quick wins in about 2 minutes.
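The IMPROVING / STABLE / DECLINING labels can be illustrated with a minimal sketch. It assumes each metric has one value per 30/60/90-day window and compares the 30-day value against the 90-day value. The function names, the 5% tolerance, and the sample numbers are hypothetical and are not taken from the agent prompts.

```python
def classify_trend(recent, baseline, higher_is_better=True, tolerance=0.05):
    """Label the relative change between a recent and an older window."""
    if baseline == 0:
        return "STABLE"  # no meaningful relative change without a baseline
    change = (recent - baseline) / abs(baseline)
    if not higher_is_better:
        change = -change  # e.g. a shorter CI duration counts as improvement
    if change > tolerance:
        return "IMPROVING"
    if change < -tolerance:
        return "DECLINING"
    return "STABLE"


def trend_from_windows(windows, higher_is_better=True):
    """windows maps window length in days (30, 60, 90) to a metric value."""
    return classify_trend(windows[30], windows[90], higher_is_better)


# Hypothetical median pipeline durations in minutes per window.
ci_minutes = {30: 12.0, 60: 14.0, 90: 15.0}
print(trend_from_windows(ci_minutes, higher_is_better=False))  # IMPROVING
```

The tolerance band keeps small fluctuations from being reported as a change in direction. A fuller version could also use the 60-day value to check that the movement is consistent.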
How I built it
I built 7 agents and 1 orchestrator flow using the GitLab Duo Agent Platform YAML schema:
- DX Insights All-in-One - a single agent that runs the full pipeline (collect → analyze → commit → report) in one Duo Chat session, using 25 GitLab API tools including create_commit and create_merge_request
- DX Insights Flow - a 5-agent orchestrated pipeline: Data Collector → 3 parallel analyzers (Feedback Loops, Flow State, Cognitive Load) → DX Advisor, for environments supporting flow invocation
- DX Quick Scan - a standalone lightweight agent for single-project health checks
Key technical decisions:
- Multi-window analysis (30/60/90 days) to detect trends rather than giving a single snapshot
- Per-developer attribution threaded through the entire pipeline so recommendations are personalized
- Cross-repo report commits - reports can be saved to any target repository, enabling a separation between the analyzed group and the reporting infrastructure
- Companion GCP pipeline - a separate repository handles CI-driven ingestion of JSON reports into Cloud Functions, BigQuery, and Looker Studio for long-term dashboarding
- Prompt engineering under constraints - system prompts were compressed to stay under the ~7,500 character safe limit while maintaining analytical depth
All agents set public: true and use only native GitLab Duo tools, with no external APIs or custom code.
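The limits reported in this write-up can be checked before a definition is submitted: a system prompt of roughly 7,500 characters, descriptions of at most 1,024 characters, component names made of letters, digits, and underscores, and toolset entries as plain strings. The sketch below is one way to do that. It assumes the YAML has already been loaded into a Python dict. The field names (system_prompt, description, components, name, toolset) are hypothetical and are not the platform's documented schema.

```python
import re

PROMPT_LIMIT = 7500        # approximate safe limit reported in this write-up
DESCRIPTION_LIMIT = 1024   # description maximum reported in this write-up
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def check_definition(definition):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    prompt = definition.get("system_prompt", "")
    if len(prompt) > PROMPT_LIMIT:
        problems.append(f"system_prompt has {len(prompt)} chars (limit ~{PROMPT_LIMIT})")
    description = definition.get("description", "")
    if len(description) > DESCRIPTION_LIMIT:
        problems.append(f"description has {len(description)} chars (max {DESCRIPTION_LIMIT})")
    for component in definition.get("components", []):
        name = component.get("name", "")
        if not NAME_PATTERN.fullmatch(name):
            problems.append(f"component name {name!r} must be alphanumeric or underscore")
        for tool in component.get("toolset", []):
            if not isinstance(tool, str):
                problems.append(f"toolset entry {tool!r} in {name!r} must be a plain string")
    return problems


# Hypothetical definition with one invalid component name.
example = {
    "system_prompt": "Collect metrics, analyze, commit, report.",
    "description": "DX analysis agent",
    "components": [{"name": "data-collector", "toolset": ["create_commit"]}],
}
for problem in check_definition(example):
    print(problem)
```

Running a check like this locally is cheaper than discovering each rule through a failed validation on the platform.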
Challenges I ran into
- YAML schema discovery - the Duo Agent Platform schema has undocumented constraints (e.g., toolset in flows must be plain strings, not objects; component names must be alphanumeric + underscore only; description max 1024 chars). I hit validation errors repeatedly and had to reverse-engineer the rules through trial and error.
- Prompt size limits - the All-in-One agent needs to encode collection logic, three analysis frameworks, JSON schema, MR creation, and visual formatting into a single system prompt under ~7,500 characters. This required aggressive compression and careful prioritization of instructions.
- DORA metric approximation - GitLab's API doesn't directly expose DORA metrics via the tools available to Duo agents, so I had to compute them from raw data (merges to default branch as deployment proxy, pipeline failure rates, recovery times from consecutive pipeline runs).
- Burnout detection framing - I wanted per-developer health signals but needed to frame them as supportive rather than surveillance. I spent time ensuring the language emphasizes "support" and "watch" rather than "flag" or "monitor."
- Flow invocation uncertainty - Duo Chat may not yet support flow invocation, so I built the All-in-One agent as a parallel path that delivers the same functionality in a single agent call.
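The DORA approximation described above can be sketched in a few functions. The sketch assumes two inputs already extracted from the API: timestamps of merges to the default branch, and (finished_at, status) pairs for default-branch pipelines. The function names and sample data are hypothetical, and this is an illustration of the proxy idea, not the logic encoded in the agent prompts.

```python
from datetime import datetime, timedelta


def deployment_frequency(merge_times, window_days):
    """Merges to the default branch per week, used as a deployment proxy."""
    return len(merge_times) / (window_days / 7)


def change_failure_rate(pipelines):
    """Share of pipelines whose status is 'failed'."""
    if not pipelines:
        return 0.0
    failed = sum(1 for _, status in pipelines if status == "failed")
    return failed / len(pipelines)


def recovery_times(pipelines):
    """Time from the first failure in a streak to the next success."""
    durations = []
    failure_start = None
    for finished_at, status in sorted(pipelines, key=lambda p: p[0]):
        if status == "failed" and failure_start is None:
            failure_start = finished_at
        elif status == "success" and failure_start is not None:
            durations.append(finished_at - failure_start)
            failure_start = None
    return durations


# Hypothetical default-branch pipeline history.
t0 = datetime(2025, 1, 1, 9, 0)
history = [
    (t0, "success"),
    (t0 + timedelta(hours=1), "failed"),
    (t0 + timedelta(hours=2), "failed"),
    (t0 + timedelta(hours=4), "success"),
]
print(change_failure_rate(history), recovery_times(history))
```

Treating a streak of consecutive failures as one incident avoids counting the same outage several times when retries also fail.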
Accomplishments that I'm proud of
- 7 agents + 1 flow - a comprehensive multi-agent system, not just a single chatbot prompt
- End-to-end automation - from raw GitLab API data to a committed JSON report with an open MR, all within a single Duo Chat conversation
- Per-developer burnout detection - going beyond team averages to identify individuals who may need support, using activity pattern signals (high WIP, erratic commit times, project fragmentation)
- Green AI scoring - a novel dimension that evaluates how efficiently teams use AI/LLM resources and estimates CO2 impact, with actionable recommendations to reduce waste
- Historical trend analysis - not just a snapshot but a trajectory, showing whether things are getting better or worse across 30/60/90-day windows with a projected 30-day outlook
- Cross-repo report pipeline - reports are machine-readable JSON committed via MR, enabling downstream automation (BigQuery ingestion, Looker dashboards, alerting) without any custom infrastructure in the analyzed group
- Solo build - I designed, built, and iterated on the entire system alone, from architecture to prompt engineering to documentation
What I learned
- The Duo Agent Platform is powerful but young - it can orchestrate complex multi-step workflows with real GitLab API tools, but documentation gaps and schema quirks require patience and experimentation
- Prompt engineering is a real engineering discipline - fitting a 4-phase analytical pipeline into a single ~7,500 character system prompt requires the same rigor as writing tight, well-factored code
- DX metrics are deeply interconnected - improving CI speed (Feedback Loops) directly reduces context switching (Flow State) which lowers cognitive overhead (Cognitive Load). The framework reinforces itself.
- Per-developer data changes the conversation - team averages hide the real story. When you can show that one developer is juggling 6 concurrent MRs across 4 projects while another has a clean focus window, the recommendations become specific and actionable.
- AI sustainability matters now - even small optimizations (removing unused tools from agent configs, narrowing context windows) can significantly reduce token usage and associated environmental cost
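The point about team averages can be made concrete with a small aggregation. The sketch below assumes a list of (author, project) pairs, one per open MR. The function name and the sample developers are hypothetical.

```python
from collections import defaultdict


def work_in_progress(open_mrs):
    """Count open MRs and distinct projects per author."""
    mr_count = defaultdict(int)
    projects = defaultdict(set)
    for author, project in open_mrs:
        mr_count[author] += 1
        projects[author].add(project)
    return {
        author: {"open_mrs": mr_count[author], "projects": len(projects[author])}
        for author in mr_count
    }


# Hypothetical data: one developer spread thin, one with a clean focus window.
open_mrs = [
    ("dev_a", "api"), ("dev_a", "api"), ("dev_a", "web"),
    ("dev_a", "infra"), ("dev_a", "docs"), ("dev_a", "web"),
    ("dev_b", "api"),
]
profile = work_in_progress(open_mrs)
team_average = sum(p["open_mrs"] for p in profile.values()) / len(profile)
print(profile, team_average)
```

Here the team average of 3.5 open MRs describes neither developer, which is why the per-developer view leads to more specific recommendations.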
What's next for DX Insights Agent
- Real-world validation - run the agent across diverse GitLab groups (small teams, large orgs, open source projects) and iterate on scoring calibration and prompt accuracy
- Scheduled analysis - trigger DX reports automatically via GitLab CI schedules (weekly/monthly) so teams get continuous visibility without manual invocation
- Looker Studio dashboards - the companion GCP pipeline already ingests reports into BigQuery; next step is building polished dashboards for historical DX trends across quarters
- Team comparison - benchmark DX scores across multiple groups within an organization, identifying teams that are thriving and teams that need support
- Custom dimension weights - let teams configure which DX dimensions matter most to them (e.g., a platform team may weight Cognitive Load higher than Flow State)
- GitLab issue integration - automatically create issues for critical pitfalls and assign them to relevant developers, closing the loop from insight to action
- Expanded AI Engineering analysis - deeper integration with Duo usage metrics as they become available, including suggestion acceptance rates, code generation quality, and AI-assisted review coverage
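Custom dimension weights could be as simple as a weighted average over the three dimension scores. The sketch below assumes scores on a 0-100 scale. The dimension keys and the function name are hypothetical, since this feature is not built yet.

```python
DIMENSIONS = ("feedback_loops", "flow_state", "cognitive_load")


def weighted_dx_score(scores, weights=None):
    """Weighted average of the three dimension scores."""
    if weights is None:
        weights = {dimension: 1.0 for dimension in DIMENSIONS}
    total_weight = sum(weights[dimension] for dimension in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(scores[d] * weights[d] for d in DIMENSIONS)
    return weighted_sum / total_weight


# Hypothetical scores; a platform team doubles the weight of Cognitive Load.
scores = {"feedback_loops": 70, "flow_state": 60, "cognitive_load": 40}
platform_weights = {"feedback_loops": 1, "flow_state": 1, "cognitive_load": 2}
print(weighted_dx_score(scores), weighted_dx_score(scores, platform_weights))
```

Normalizing by the total weight keeps the result on the same scale as the inputs, so scores stay comparable between teams that choose different weights.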
Built With
- agentic
- ai
- bigquery
- cloudrun
- duo
- gcp
- gitlab
- kestra
- multiagent
- python

