Inspiration
AI can write code now. That part is no longer surprising.
What still slows teams down is everything around the code: figuring out what is actually broken, reproducing UI issues, mapping visual problems back to source files, preparing safe fixes, routing work into review, and confirming whether the fix truly worked.
That gap inspired DevPilot.
We wanted to build something that feels less like a chatbot and more like a real engineering teammate — one that can inspect a live app, understand what it sees, reason about the repository, prepare a patch, hand work off into GitLab-style workflows, and then verify the result.
The core idea was simple:
Go from “something looks broken” to “here is the issue, here is the patch, and here is whether it was actually fixed.”
What it does
DevPilot is an AI developer teammate that turns runtime issues and repository tasks into structured engineering workflows.
A user can describe:
- a UI defect
- a repository task
- a cleanup request
- or a verification goal
Then DevPilot routes that work through a multi-step flow:
UI Inspection
DevPilot opens the target app in a sandbox environment, inspects the interface, captures screenshots, and collects runtime signals.
Analysis
It analyzes what it found and converts visual/runtime problems into structured issue descriptions.
Code Fix Generation
DevPilot maps the issue to likely source files, loads repository context, and prepares a patch proposal.
GitLab Handoff
Approved fixes can move into a repository mutation / merge request workflow.
Verification
After the proposed fix, DevPilot re-checks the app to confirm the issue is resolved and detect regressions.
Background Code Review Discovery
Beyond active tasks, DevPilot can also discover additional review opportunities across repositories — such as UI issues, security concerns, performance problems, testing gaps, and code health tasks — and surface them as actionable review items.
In short, DevPilot combines:
- live inspection
- repo-aware reasoning
- patch preparation
- handoff flow
- post-fix verification
- proactive code review discovery
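The flow above can be sketched as a small linear phase model. This is an illustrative sketch only; the phase names and `nextPhase` helper are assumptions for the example, not DevPilot's actual identifiers.

```typescript
// Hypothetical sketch of DevPilot's workflow as a linear phase model.
// Phase names are illustrative; the real identifiers may differ.
type Phase =
  | "inspection"
  | "analysis"
  | "code_fix"
  | "gitlab_handoff"
  | "verification"
  | "done";

const ORDER: Phase[] = [
  "inspection",
  "analysis",
  "code_fix",
  "gitlab_handoff",
  "verification",
  "done",
];

// Advance to the next phase, staying at "done" once the task is complete.
function nextPhase(current: Phase): Phase {
  const i = ORDER.indexOf(current);
  return i < ORDER.length - 1 ? ORDER[i + 1] : "done";
}
```

A linear ordering like this keeps every task's position in the pipeline explicit, which is what makes approval checkpoints and handoff state easy to render in the UI.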
How we built it
We built DevPilot as a multi-layer system that combines UI orchestration, sandbox execution, structured task state, and agent-style workflows.
Frontend
The product interface is a two-page micro SaaS experience:
- a dashboard/intake page
- a task workspace
The workspace is split into three core surfaces:
- left panel: agent intelligence and workflow messages
- center panel: inspection/runtime preview
- right panel: patch proposal / diff output
We designed the product to feel like an actual engineering control surface rather than a generic AI chat app.
Local-first orchestration
We used a local-first architecture with structured task and workflow state so the UI remains fast, reactive, and resilient.
This includes state for:
- tasks
- runs
- workflow phases
- agent events
- patch proposals
- verification plans/results
- background code review issues
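As a rough sketch of what that structured state might look like, the record shapes below are illustrative; the field names and status values are assumptions for the example, not DevPilot's actual schema. In a Dexie/IndexedDB-backed store these would map to indexed tables keyed by `id`.

```typescript
// Illustrative record shapes for DevPilot's local-first task state.
// Field names and status values are assumptions, not the actual schema.
interface TaskRecord {
  id: string;
  title: string;
  kind: "ui_defect" | "repo_task" | "cleanup" | "verification";
  createdAt: number; // epoch millis
}

interface PatchProposal {
  id: string;
  taskId: string;
  files: string[]; // paths the patch touches
  diff: string;    // unified diff text
  status: "draft" | "approved" | "handed_off";
}

interface VerificationResult {
  taskId: string;
  passed: boolean;
  notes: string;
}

// A proposal is eligible for GitLab handoff once it has been approved.
function isApproved(p: PatchProposal): boolean {
  return p.status === "approved" || p.status === "handed_off";
}
```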
Sandbox runtime
For inspection, we moved to a separate Cloud Run sandbox service instead of keeping everything in the frontend.
That sandbox is designed to support:
- repository setup
- project root detection
- framework detection
- package manager detection
- install/build/dev command execution
- browser automation
- live preview architecture
- screenshot and runtime artifact capture
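Project root detection in that list can be sketched as a pure function over the repository's file listing: pick the shallowest directory containing a `package.json`, ignoring anything under `node_modules`. This is a minimal illustration of the idea, not DevPilot's actual detection logic.

```typescript
// Hypothetical sketch of app-root detection: given all file paths in a
// cloned repository, return the shallowest directory containing a
// package.json, skipping node_modules.
function detectProjectRoot(paths: string[]): string | null {
  const candidates = paths
    .filter((p) => p.endsWith("package.json") && !p.includes("node_modules"))
    .map((p) => p.slice(0, p.lastIndexOf("package.json")).replace(/\/$/, "") || ".");
  if (candidates.length === 0) return null;
  // Prefer the shallowest candidate (fewest path segments), which handles
  // monorepos where nested packages also carry their own package.json.
  candidates.sort((a, b) => a.split("/").length - b.split("/").length);
  return candidates[0];
}
```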
AI / agent flow
We modeled DevPilot around specialized responsibilities:
- inspection
- code reasoning
- patch preparation
- verification
We also aligned the architecture with a GitLab Duo-style flow model, with phases, agent roles, approval checkpoints, and handoff state.
Repository and review flow
We added structured repository mutation and review logic such as:
- patch proposal preparation
- merge request creation
- pipeline-aware handoff
- event-style status tracking
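The merge request creation step can be sketched as building a payload for GitLab's REST endpoint (`POST /projects/:id/merge_requests`). The endpoint and its fields are GitLab's real API; the branch naming convention and helper below are assumptions for the example.

```typescript
// Sketch of preparing a GitLab merge request payload for the handoff step.
// The field names match GitLab's POST /projects/:id/merge_requests API;
// the devpilot/fix-* branch convention is an assumption.
interface MergeRequestPayload {
  source_branch: string;
  target_branch: string;
  title: string;
  description: string;
  remove_source_branch: boolean;
}

function buildMergeRequest(taskId: string, summary: string): MergeRequestPayload {
  return {
    source_branch: `devpilot/fix-${taskId}`, // assumed branch convention
    target_branch: "main",
    title: `DevPilot: ${summary}`,
    description: `Automated patch proposal for task ${taskId}.`,
    remove_source_branch: true,
  };
}

// The payload would be POSTed to
// `${gitlabUrl}/api/v4/projects/${projectId}/merge_requests`
// with a PRIVATE-TOKEN header.
```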
Background discovery
After repository context is loaded, DevPilot can also run a quiet background discovery pass to surface multiple code review issues across repositories, grouped by categories like:
- UI
- Security
- Performance
- Code Health
- Testing
- Cleanup
Challenges we ran into
This project was much harder than “build a pretty AI frontend.”
1. Turning screenshots into engineering tasks
It is one thing to detect that something looks wrong in the UI. It is much harder to:
- describe the problem clearly
- infer which files are likely involved
- propose a safe patch
- and keep that whole flow structured enough to review
2. Sandbox reliability
A major challenge was making the sandbox smart enough to handle real repositories.
We ran into issues around:
- wrong working directories
- missing `package.json`
- skipped dev dependencies
- missing package managers like `pnpm`
- framework-specific build differences
- repository structure detection in nested and monorepo-like layouts
We had to harden the sandbox bootstrap flow so it could:
- detect app roots
- detect framework type
- detect package manager
- install the right tooling
- run the right build/dev commands
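The package-manager half of that bootstrap decision can be sketched as a lockfile check that then selects the matching install/dev commands. This is a simplified illustration under the assumption that lockfiles are the primary signal; a hardened version would also consult `packageManager` fields and framework config.

```typescript
// Hypothetical sketch of sandbox bootstrap: infer the package manager
// from lockfiles, then pick install/dev commands accordingly.
type PackageManager = "pnpm" | "yarn" | "npm";

function detectPackageManager(files: string[]): PackageManager {
  if (files.includes("pnpm-lock.yaml")) return "pnpm";
  if (files.includes("yarn.lock")) return "yarn";
  return "npm"; // default, covers package-lock.json and no lockfile at all
}

function bootstrapCommands(pm: PackageManager): { install: string; dev: string } {
  switch (pm) {
    case "pnpm": return { install: "pnpm install", dev: "pnpm run dev" };
    case "yarn": return { install: "yarn install", dev: "yarn dev" };
    case "npm":  return { install: "npm install", dev: "npm run dev" };
  }
}
```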
3. Keeping the UI powerful but clear
There is a fine line between:
- “this feels like a real engineering tool” and
- “this looks like a cluttered internal dashboard”
We wanted the product to look premium and focused, while still communicating:
- runtime inspection
- agent intelligence
- patch output
- review state
- verification flow
4. Structured state everywhere
A lot of the difficulty was not visual — it was data modeling.
We had to think carefully about:
- workflow phases
- patch proposal structures
- verification result models
- GitLab handoff records
- event-driven updates
- code review issue generation
- background discovery deduplication
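The deduplication step in that list can be sketched as keying each discovered issue on its category, file, and normalized title. The record shape and key format here are assumptions for the example, not DevPilot's actual model.

```typescript
// Sketch of background-discovery deduplication: collapse issues that share
// a category, file, and normalized title. Shapes are illustrative.
interface ReviewIssue {
  category: "UI" | "Security" | "Performance" | "Code Health" | "Testing" | "Cleanup";
  file: string;
  title: string;
}

function dedupeIssues(issues: ReviewIssue[]): ReviewIssue[] {
  const seen = new Set<string>();
  const out: ReviewIssue[] = [];
  for (const issue of issues) {
    const key = `${issue.category}|${issue.file}|${issue.title.trim().toLowerCase()}`;
    if (!seen.has(key)) {
      seen.add(key);
      out.push(issue);
    }
  }
  return out;
}
```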
5. Making it feel like a teammate, not a chatbot
That was one of the biggest design challenges.
We wanted DevPilot to feel proactive and operational:
- not just answering prompts
- but actually routing work
- discovering issues
- preparing actionable fixes
- and checking outcomes
Accomplishments that we're proud of
We are especially proud that DevPilot feels like a real AI engineering product, not just a demo with a chat box.
What we’re proud of:
- Building a believable UI-to-code workflow
- Creating a workspace that clearly shows:
  - agent reasoning
  - inspection evidence
  - patch proposal output
- Designing a sandbox-backed inspection architecture
- Adding post-fix verification
- Modeling the system around multi-agent / flow-based orchestration
- Adding background code review discovery that can surface multiple issues across repositories
- Keeping the product visually polished while handling a surprisingly complex backend flow
- Getting real merge request handoff behavior working far enough to prove the architecture
The biggest accomplishment is that DevPilot already demonstrates a compelling story:
Inspect → Understand → Patch → Review → Verify
That workflow is the real heart of the product.
What we learned
We learned that building an “AI teammate” is much more about systems design than about a single model call.
Some of the most important lessons were:
1. The hard part is orchestration
The model is only one piece. The real challenge is coordinating:
- runtime evidence
- repository context
- structured outputs
- handoff flows
- verification
- memory/state
2. Developer tools need strong structure
If the state model is weak, the product quickly becomes a pile of messages and logs. Strong structured models made everything better:
- tasks
- patch proposals
- verification plans
- workflow steps
- review issues
- repository action records
3. Sandboxes matter
If you want AI to inspect real software, you need a runtime environment that can deal with actual repositories and build systems. That pushed us toward a much more serious sandbox architecture.
4. AI products feel more real when they are proactive
The moment DevPilot started:
- discovering issues in the background
- surfacing review opportunities
- and routing work without needing every action manually prompted
…it began to feel far more like a teammate than a tool.
5. Verification is essential
Generating a patch is not enough. The real value comes from answering:
Did it actually fix the issue?
That changed how we thought about the whole product.
What's next for DevPilot — The GitLab UI-to-Code Agent
We see DevPilot evolving far beyond a single-task assistant.
Near-term next steps
- Stronger real GitLab execution across merge requests, pipelines, and event-driven triggers
- Better post-fix verification and regression detection
- More robust repository mutation workflows
- Smarter background code review discovery
- Cleaner cross-repo issue inboxes
Product expansions we want
- Review packs that group related discovered issues
- Better ranking and scoring for discovered tasks
- Repo memory so DevPilot gets smarter about repeated issue patterns
- Retry / re-fix loops when verification fails
- More advanced sandbox hardening and browser/session scaling
- Team and org-level workflows across multiple repositories
Long-term vision
We want DevPilot to become a true AI engineering teammate that can:
- inspect live software
- surface hidden issues
- prepare safe code changes
- route them through review
- verify outcomes
- and continuously help teams ship more confidently
The long-term goal is not just AI that writes code.
It’s AI that helps teams move software from problem to patch to proof.
Built With
- chromium
- dexie.js
- docker
- express.js
- gemini-3.1-pro-preview
- gitlab
- gitlab-merge-requests
- google-cloud-run
- indexeddb
- node.js
- novnc
- playwright
- react
- tailwind-css
- typescript
- vite
- vnc
- websockets