Orbit Sentinel — Autonomous Engineering Digital Twin

[!IMPORTANT]

🧑‍⚖️ Quick Links for Judges

  • Live Demo: orbit-sentinel.vercel.app
  • Judge's Tour: orbit-sentinel.vercel.app/?judge=trueCrucial: First-time visitors are guided by a Setup Wizard. To skip onboarding and run the automated step-by-step demo of all 8 views, open this link. Press Space to toggle auto-play, or use Left/Right arrow keys to navigate.
  • Analyze Your Own MR: Paste any public GitLab MR URL on the Overview page and click "Analyze Live". Private repos require a GitLab PAT with read_api scope (add via the token input that appears).

✅ Proof of Live Execution

Validated across 4 merge requests via the GitLab Duo Agent Platform and live dashboard — not mocks, not a single demo run. (Note: Session references are located inside our workspace repository at orbit-sentinel/docs/orbit-traversal-results.md to avoid external authentication barriers).

MR Summary What Was Proven
!10 Duo Session 10 All 4 Orbit query types executed; report posted as MR note. Ecosystem: 132,059 pipelines (17.8% failure rate), 50+ historical MRs (90% abandonment rate)
!12 Live Duo Run 51 nodes / 29 edges confirmed live — report auto-posted to MR
!3 Duo Session 3 Full graph traversal: 22 nodes / 40 relationships, 7 distinct risk signals detected
!5 Live Dashboard and GitLab MR !5 analysis Real-time dashboard verification - 224 nodes / 189 edges discovered live. Orbit Sentinel correctly scored 8% LOW risk (Draft MR, 1 file, no pipeline). Analyzed MR !5 live. Forecast Engine predicted MR closure within 7 days. Closed-loop tracking active - outcome will be verified in Predictions Tracker.

→ Full traversal log: orbit-traversal-results.md

🔮 Real Prediction Verified: MR !5 Closed-Loop Accuracy

Orbit Sentinel analyzed MR !5 (Draft: Test sentinel) live on June 24, 2026. Most prediction matched reality:

Prediction Actual Outcome Result
Risk score: 8% LOW Draft MR, 1 file changed, no reviewers ✅ Correct
0 historical matches on branch Brand-new test-sentinel branch, never merged ✅ Correct
No pipeline detected → PATH BROKEN GitLab shows 0 pipelines on this MR ✅ Correct
Orbit graph: 224 nodes / 189 edges Live Orbit API response on real project ✅ Real data

This is the closed loop in action: Predict → Ship → Verify → Learn. The Forecast Engine predicted the MR would close. It closed. That is a verified True Positive in the Predictions Tracker.

Inspiration

AI predicts code. Orbit Sentinel predicts consequences.

You hit merge. Then you hold your breath. 🔮 Will this break something? Did I miss a downstream dependency? Has this file failed before? Is there a rollback plan?

Code review catches logic errors. It cannot trace blast radius through a dependency graph. It cannot find the MR from six months ago that silently caused an incident on the same file. Engineers spend hours manually checking deployment chains, digging through incident history, and guessing at risk — because no digital twin exists to simulate the change before it lands.

GitLab Orbit exposes the entire SDLC as a queryable knowledge graph. But Orbit is an API, not a tool. Someone has to turn those four query types into something a developer can use without learning GraphQL.

Orbit Sentinel is that tool. 🛰️

What it does

Paste any GitLab MR URL. Orbit Sentinel builds a living digital twin of the affected system - then delivers an 8-view interactive dashboard and an autonomous MR note, all from the same analysis:

View What It Shows Orbit Query
🖥️ Dashboard Impact calculator, architecture diagram, evidence panel, incident intelligence, counterfactuals All 4
🎯 Predictions Tracker Accuracy scoreboard, post-merge verification, vulnerability-adjusted forecasts Closed-loop
💥 Blast Radius Interactive dependency explorer - click any node to drill into connections NEIGHBORS
🔍 Risk Investigation 5-dimension risk breakdown, pipeline failure heatmap, correlation coefficient AGGREGATION
🧪 Forecast Engine What-if simulator — toggle variables, watch risk animate in real-time Simulation
📜 Historical Context Past incidents ranked by Jaccard similarity - "has this failed before?" TRAVERSAL
📋 Impact Report Full MR comment with one-click Markdown copy or JSON export All 4
Setup Wizard 4-step guided journey — first-time visitors see this, not a blank graph

The autonomous flow 🤖 triggers automatically on MR open or commit push - executes all 4 Orbit queries, composes a 9-section report (executive summary, blast radius, failure predictions, historical context, reviewer recommendations, rollback plan, test plan, remediation steps), and posts it as an MR note.

The closed loop 🔄 - Every prediction is tracked post-merge through a 7-day survival window. The Predictions Tracker computes accuracy rate, shows risk trend charts, and surfaces vulnerability-adjusted forecasts. Orbit Sentinel doesn't just predict - it proves its predictions should be considered.

How we built it

Architecture Flow:

  1. 🦊 GitLab MR (Trigger event)
  2. ⚙️ Engine (Node.js/TypeScript, 105 tests) — processes logic, mitigations, and incident matching
  3. 🎨 Visualizer (React/D3, 30 tests) — renders the 8 interactive views and dashboard
  4. 🛰️ GitLab Orbit Knowledge Graph (Connected to both Engine & Visualizer) (Graph Scale: 23 nodes / 43 edges baseline, scaling to 224 nodes / 189 edges live peak)

Engine ⚙️ — TypeScript, Express, custom validation, 8 classified error types with exponential backoff:

  • DigitalTwinBuilder — orchestrates all 4 Orbit queries, merges into unified graph via orbitOrFallback()
  • Grep Fallback — when Orbit is unavailable, fetches files via GitLab API, parses imports, builds dependency graph without Orbit. Fast-paths to empty when no token present (no hanging)
  • Risk Engine — 5-factor scoring (predictions, blast radius, incidents, pipeline health, reviewer coverage) from Orbit evidence
  • Memory Store — Jaccard similarity engine for historical incident matching
  • Remediation Planner — rollback strategies, test plans, remediation steps by risk priority
  • Rate-limited to 28 queries per MR (was 107 — 74% reduction via dedup + file cap)

Visualizer 🎨 — React 18, D3.js, Vite, 40 components, zero CSS files:

  • 8 interactive views, responsive across 3 breakpoints (360px to 768px)
  • Design token system: centralized colors, z-index tiers, animation presets, spacing scale
  • Judge's Tour (?judge=true) 🧑‍⚖️ — guided walkthrough, Space for auto-demo, ← → / 1-8 to navigate
  • Theme toggle (light/dark) persisted to localStorage �-
  • 30 component tests (27 component + 3 app) covering DataModeBanner, PredictionsTracker, OrbitQueryInspector, DigitalTwinGraph, and all major views

Duo Agent Platform 🔌:

  • Flow YAML at flow/orbit-sentinel-flow.yaml — published to AI Catalog (multiple successful runs, public)
  • Skill at .gitlab/duo/skill.yml (relocated to repository root for auto-detection) — category: code_review, 3 capabilities, agent orchestration
  • MCP config at .gitlab/duo/mcp.json (relocated to repository root for auto-detection) — HTTP connection to Orbit knowledge graph
  • 6 query recipes in skills/orbit-sentinel/recipes/ — ready-to-use JSON for each query pattern

Deployment 🚀 — Vercel (visualizer), Render (engine), Docker Compose (full stack with health checks), CI/CD via .gitlab-ci.yml (8 jobs, 4 stages)

Testing ✅ — 135 tests (105 engine + 30 visualizer) - Orbit client error handling, all 4 query types, similarity engine edge cases, risk thresholds, twin construction, rollback strategies, remediation planning, component rendering

Challenges we ran into

Orbit API 2.1.0 format shift. result.rows became result.nodes/result.edges mid-development. Entire mergeGraph() rewritten. Went from 0 nodes to 14 nodes, 13 edges per MR. Never hardcode GraphQL response shapes.

CORS blocking GitLab file fetches. Browser couldn't fetch changed files from gitlab.com. Built a CORS proxy at /api/probe-mr-files on the engine — now all file fetches route through the server.

Rate limit amplification. Initial code made 107 Orbit queries per MR analysis — throttled to 28 via deduplication and query-type merging (74% reduction). Separately, the per-MR file cap was raised from 5 → 15 for broader coverage, with a transparent UI notice when the cap is hit. These are independent changes: fewer queries per file, more files analysed.

Dark screen on GitLab Pages. Pages access control was Private, requiring auth the static site couldn't provide. Moved to Vercel. Kept Pages config for future use.

Fallback hung without a token. Grep fallback timed out at 30 seconds when no GITLAB_ACCESS_TOKEN was configured. Added fast-path: no token → return empty immediately. Failure mode went from 30s timeout to instant.

Accomplishments that we're proud of

135 tests and 470+ commits. Orbit client retry logic, all 4 query types, similarity engine edge cases, digital twin construction, component rendering state coverage. No coverage theater.

Live Orbit data, not mocks. The engine queries a real GitLab Orbit knowledge graph. The baseline demo graph is calibrated to 23 nodes + 43 edges for optimal visual layout and performance, while our live test runs on MR !5 reached a peak observed scale of 224 nodes + 189 edges across 18 node types. Session #4587076 confirmed live execution: 51 nodes, 29 edges, report posted as MR !12 note.

Closed-loop accuracy. We don't predict and disappear. Every prediction is tracked post-merge with a 7-day survival window. Accuracy rate, risk trends, vulnerability-adjusted forecasts — all surfaced in the Predictions Tracker. This is the difference between a demo and a tool. 🎯

Fallback resilience. When Orbit is unavailable, the engine degrades to grep-based file analysis. The visualizer shows a "Degraded" mode banner. No crash. No blank screen. No "Orbit is down" error.

40 components, 8 views, 3 breakpoints, zero CSS files. Design token system. Theme toggle. Keyboard shortcuts. Judge's Tour. Dynamic risk verdicts. Everything inline-styled. 📱

Flow published to AI Catalog alongside GitLab's official flows - includes a fully interactive visual dashboard.

What we learned

The four Orbit query types are not interchangeable - they're complementary. NEIGHBORS reveals blast radius. PATH_FINDING shows propagation chains. TRAVERSAL surfaces historical patterns. AGGREGATION measures systemic fragility. Combined, they paint a complete picture that no single query can. Using all four is not redundant - it's the whole point.

A digital twin without a fallback is a gamble. When Orbit is available, the twin is rich with graph evidence. When it's not, you still need to ship analysis. The fallback isn't a substitute — it's a safety net that proves the value of the real thing by showing what you lose without it.

Visual feedback changes trust. The same analysis posted as an MR note feels different when rendered as an interactive dashboard. Clicking a node, adjusting a slider, watching accuracy trend over time - these build confidence that text alone cannot provide.

Real Orbit data reveals what mocks cannot. GraphQL response shapes, rate limiting behavior, schema differences between indexed and non-indexed projects - none of these show up in unit tests. Our fallback system exists because we learned this the hard way.

What's next for Orbit Sentinel

  1. Multi-group orbit traversal — Trace dependencies across GitLab groups, not just single projects
  2. Natural language MR briefs — Plain-English summaries for non-technical stakeholders (product managers, support engineers)
  3. VSCode extension — Inline blast radius hints while editing, before the MR exists
  4. Predictive CI/CD — Use AGGREGATION data to predict pipeline failure probability before the pipeline runs
  5. Open source recipe marketplace — Community-contributed query patterns for the Duo Agent Platform

Built With

Share this project:

Updates