Inspiration

Your team uses AI agents every day — code reviews, pipeline fixes, code generation. But nobody can answer: What exactly did your AI agents do? How much did they cost? Are they getting better over time?

DuoLens is the missing governance layer. Inspired by Readout.org — a unified AI agent dashboard — we set out to build the same observability experience natively within GitLab, using nothing but the Duo Agent Platform.

What it does

DuoLens provides three custom flows on the GitLab Duo Agent Platform:

| Flow | Purpose |
| --- | --- |
| 🔍 Audit Trail | Generates compliance-ready AI Activity Reports documenting what agents did, which files changed, and what decisions were made |
| 🧠 Knowledge Librarian | Extracts patterns from agent sessions and builds institutional memory in AGENTS.md |
| 💰 Cost Analyst | Tracks AI credit consumption, forecasts spend, and recommends cost optimizations |

All flows are triggered simply by @mentioning a service account on any MR or Issue.
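Under the hood, that trigger is just a note on the MR. A minimal sketch of firing a flow programmatically via the standard GitLab notes REST endpoint (the `@duolens-audit` account name is a hypothetical placeholder, not the project's actual service account):

```python
import json
import urllib.request

GITLAB_URL = "https://gitlab.com"  # assumption: hosted gitlab.com instance


def build_trigger_note(flow_mention: str, request_text: str) -> dict:
    """Build the notes-API payload that @mentions a flow's service account.

    `flow_mention` (e.g. "@duolens-audit") is an illustrative account name.
    """
    return {"body": f"{flow_mention} {request_text}"}


def trigger_flow(project_id: int, mr_iid: int, token: str, payload: dict) -> None:
    # POST /projects/:id/merge_requests/:iid/notes -- the standard GitLab notes endpoint
    url = f"{GITLAB_URL}/api/v4/projects/{project_id}/merge_requests/{mr_iid}/notes"
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)  # posting the mention is what starts the flow


payload = build_trigger_note("@duolens-audit", "please generate an AI Activity Report")
```

The same payload works on issues by swapping the `merge_requests` path segment for `issues`.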

How we built it

  1. Designed flow YAML configs with system prompts for each agent persona
  2. Synced to AI Catalog via CI/CD pipeline with semantic versioning
  3. Iterated on prompts based on real agent behavior — learning that agents need explicit instructions to post output
  4. Tested across 30+ pipeline runs and 17+ agent sessions on real MRs
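To make step 1 concrete, here is an illustrative sketch of what one of our flow configs looks like. Field names approximate the shape of a custom flow definition, not the exact Duo Agent Platform schema:

```yaml
# Illustrative only -- field names are approximations, not the official
# Duo Agent Platform flow schema.
name: audit-trail
version: 1.2.0            # semantic versioning, bumped by the sync pipeline
prompt: |
  You are an audit agent. Fetch the MR context via the available tools.
  CRITICAL: as your final step, post your findings as a comment on the MR.
tools:
  - read_merge_request    # every action the prompt asks for needs a matching tool
  - post_comment
```

The "CRITICAL" final-step directive and the tool-per-action pairing came directly out of the prompt iteration in step 3.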

Tech stack: GitLab Duo Agent Platform, Custom Flows (AI Catalog), Anthropic Claude Sonnet 4, GitLab GraphQL & REST APIs, Python (FastAPI backend), YAML configurations

Challenges we faced

  • API Limitations: The aiUsageData GraphQL field returns null in the hackathon environment (it requires Duo Enterprise). We adapted with activity-based cost estimates.
  • Token Scope Bug: Composite identity tokens lack the read_api scope, causing 403 errors on startup — a known GitLab issue (#592015). The flows still work despite this.
  • Prompt Compliance: Agents sometimes ignored instructions to auto-fetch context. We strengthened prompts with "CRITICAL" directives.
  • Platform Issue: Pulse Reporter flow has a WebSocket issue we couldn't resolve — documented for future work.
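Since token-level usage data was unavailable, the Cost Analyst's fallback was activity-based estimation. A minimal sketch of the idea — the per-event credit weights here are illustrative assumptions, not published GitLab rates:

```python
from collections import Counter

# Assumed per-event credit weights -- illustrative values, not official pricing.
CREDIT_WEIGHTS = {"agent_session": 5.0, "comment_posted": 0.5, "pipeline_run": 1.0}


def estimate_credits(events: list[str]) -> float:
    """Estimate AI credit spend from a stream of activity event names.

    Unknown event types contribute zero rather than raising, so the
    estimate degrades gracefully as new activity types appear.
    """
    counts = Counter(events)
    return sum(CREDIT_WEIGHTS.get(name, 0.0) * n for name, n in counts.items())


events = ["agent_session", "comment_posted", "agent_session", "pipeline_run"]
total = estimate_credits(events)  # 2*5.0 + 0.5 + 1.0 = 11.5
```

Presenting this as an explicit estimate, with the weights visible, is what the "transparent approximations over false precision" lesson below refers to.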

What we learned

  1. Use context:goal as the primary input — let agents fetch context dynamically
  2. Make "post your findings" a mandatory final step — agents won't post unless explicitly told
  3. Every prompt action needs a matching tool in the toolset
  4. Users prefer transparent approximations over false precision when data is unavailable

What's next

  • 📊 Readout-Style Dashboard — Unified observability UI
  • 🔧 Pulse Reporter Fix — Resolve platform-level issue
  • 📈 Token-Level Cost Tracking — When usage APIs become available
  • 🔄 Scheduled Reports — Automatic weekly reports via pipeline schedules
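The scheduled-reports item above maps onto standard GitLab CI mechanics: a job gated to scheduled pipelines, plus a weekly schedule created in the project's Pipeline schedules settings. A sketch (job and script names are hypothetical):

```yaml
# .gitlab-ci.yml sketch: run the report job only on scheduled pipelines.
# The weekly cadence itself is configured under CI/CD > Pipeline schedules.
weekly-report:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
  script:
    - python generate_report.py   # hypothetical report-generation script
```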

Built With

  • claude
  • duo-agent
  • fastapi
  • gitlab
  • python
  • yaml