Inspiration
CI/CD has transformed software development by turning code review, version control, and quality checks into a structured pipeline. But design, despite being a core pillar of product development, still lacks an equivalent workflow. Designers depend heavily on comments, manual reviews, and repeated visual checks to confirm that a design actually aligns with a design system.
We were inspired by the belief that design deserves its own pipeline-native review experience. As Figma and GitHub begin introducing more Git-like workflows into design, we see a major gap that remains unfilled: design review itself has not yet been fully reimagined through the CI/CD paradigm. PixelPipeline is our attempt to explore that missing layer on GitLab and make the platform more compelling not only for developers, but also for designers.
What it does
PixelPipeline brings design review into a GitLab-native CI/CD workflow. Using the GitLab Duo Agent Platform, it submits a Figma design draft into an automated review pipeline, compares the draft against a target design system, and detects where the design is misaligned. Instead of relying only on explicit structure, it performs semantic inference to understand what a UI element is intended to be, identifies the closest matching design-system component, and generates suggested fixes to bring the element back into alignment, including adjustments to color, shape, typography, spacing, and other visual properties.
On the user side, we built a Figma plugin so the workflow fits naturally into a designer’s existing process. Designers can stay inside Figma, submit a review directly from the plugin, and wait for the GitLab pipeline to finish running in the background. Once the review is complete, the plugin pulls the generated fix suggestions back into Figma, where the designer can review and apply them with one click.
How we built it
We built PixelPipeline as a three-part architecture connecting Figma, GitLab, and a design-system service layer.
First, we created a custom GitLab flow powered by the GitLab Duo Agent Platform, with four AgentComponents working as a pipeline. After a designer submits a review, the design draft is exported from Figma as JSON and committed into a merge request. The intake agent reads the MR and retrieves the design source. The component_classifier agent then uses LLM reasoning to infer what each visual element is supposed to represent. In design, a button may appear as a pill-shaped rectangle with text, a nested frame, or simply something labeled "button" or "btn", so we cannot rely on code-like classes alone. Instead, the agent looks at visual composition, hierarchy, position, dimensions, and naming conventions to infer the semantic component type. The component_patch_planner agent then combines the original design draft with the target design-system properties, such as typography, color tokens, and component definitions, to produce a code-like merge plan. Finally, migration_writer turns that plan into two outputs: a human-readable design review for the merge request, and a machine-readable JSON fix spec that can be sent back to the designer and applied automatically.
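To make the last stage concrete, here is a minimal sketch of how a merge plan could be turned into the two outputs described above. It is an illustration under assumptions, not the actual migration_writer: the `PropertyFix` fields, the `write_outputs` function, and the layout of the fix spec are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PropertyFix:
    # All field names are hypothetical; the real fix spec may differ.
    node_id: str    # id of the draft element in Figma
    component: str  # component type inferred by the classifier
    prop: str       # property to change, e.g. "cornerRadius"
    current: str    # value found in the draft
    target: str     # value defined by the design system


def write_outputs(fixes):
    """Return (review_text, fix_spec_json) for a list of PropertyFix."""
    # Human-readable review, suitable for a merge request comment.
    lines = [f"Design review: {len(fixes)} property change(s) suggested"]
    for f in fixes:
        lines.append(
            f"- {f.node_id} ({f.component}): {f.prop} {f.current} -> {f.target}"
        )
    review = "\n".join(lines)

    # Machine-readable spec that a plugin could fetch and apply.
    spec = json.dumps({"version": 1, "fixes": [asdict(f) for f in fixes]}, indent=2)
    return review, spec


if __name__ == "__main__":
    review, spec = write_outputs(
        [PropertyFix("12:34", "button", "cornerRadius", "4", "8")]
    )
    print(review)
    print(spec)
```

Deriving both outputs from the same list of fixes keeps the review a human reads and the spec a plugin applies from drifting apart.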
Second, for the designer-facing experience, we built a Figma plugin using HTML/JavaScript. The plugin allows designers to stay entirely inside Figma: with one click, they can submit the selected draft, call the GitLab API, create the JSON commit, and open the merge request. Once the GitLab pipeline finishes, the same plugin fetches the generated fixes and lets the designer review and apply them with one click.
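The submit step above maps onto two GitLab REST API calls: one that creates a commit containing the exported JSON, and one that opens a merge request. The sketch below only builds the request paths and bodies and sends nothing. It is written in Python for illustration (the plugin itself is HTML/JavaScript), and the branch name, file path, commit message, and MR title are assumptions rather than the values the plugin actually uses.

```python
import json


def build_review_requests(project_id, draft,
                          branch="design-review/draft-1",
                          target_branch="main"):
    """Build (commit_request, merge_request) payloads for a design review.

    project_id is assumed to be a numeric GitLab project id.
    """
    base = f"/api/v4/projects/{project_id}"

    # POST .../repository/commits creates the branch and adds the draft file.
    commit_request = {
        "path": f"{base}/repository/commits",
        "body": {
            "branch": branch,
            "start_branch": target_branch,
            "commit_message": "Submit design draft for review",
            "actions": [
                {
                    "action": "create",
                    "file_path": "design/draft.json",  # hypothetical location
                    "content": json.dumps(draft, indent=2),
                }
            ],
        },
    }

    # POST .../merge_requests opens the MR that the review flow reads.
    merge_request = {
        "path": f"{base}/merge_requests",
        "body": {
            "source_branch": branch,
            "target_branch": target_branch,
            "title": "Design review: draft-1",
        },
    }
    return commit_request, merge_request


if __name__ == "__main__":
    commit, mr = build_review_requests(42, {"nodes": []})
    print(commit["path"])
    print(mr["path"])
```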
Third, on the backend, we built a separate Google Cloud Function service that exposes an API for parsed and scraped design-system data. This service provides the pipeline with fast access to normalized design-system tokens and component specifications, so the GitLab agents can compare raw design drafts against real system definitions without having to scrape everything during each review.
Challenges we ran into
One major challenge was that design elements are not implemented in one canonical way. Unlike code, where classes and types are usually explicit, the same UI intent in a design draft can be represented through many different combinations of frames, rectangles, text layers, and naming patterns. That meant we could borrow the CI/CD paradigm from software engineering, but we could not simply copy code-linting logic into design review. To address this, we relied on LLM-based semantic inference: instead of only checking structure, we used signals like component names, layout position, hierarchy, and likely function to infer what an element was meant to be and map it to the closest design-system component.
A second challenge appeared when we moved from review to automatic alignment. Even if we could semantically match a draft element to the right design-system component, Figma libraries use their own internal keys and component references, and those are not trivial to recover without scanning the entire library. That made full component-level replacement much harder than expected. To make one-click fixes practical, we changed strategy: instead of trying to replace whole components, we shifted to property-level alignment. In other words, rather than swapping an entire element for a library component, we align the element’s properties, such as color, typography, sizing, spacing, and shape, to the values defined by the target design system. This turned out to be a much better fit for structured reasoning and allowed the pipeline to generate fixes that were both realistic and actionable.
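Property-level alignment can be sketched as a simple diff between an element's properties and the values the matched component defines. The example below is a simplified illustration: the property names, the shape of `element`, and the `plan_property_fixes` helper are hypothetical, and a real planner would also need tolerances and unit handling.

```python
def _normalize(value):
    # Compare strings such as hex colors case-insensitively.
    return value.lower() if isinstance(value, str) else value


def plan_property_fixes(element, target_props):
    """Return one fix per property whose draft value differs from the target."""
    fixes = []
    draft_props = element.get("props", {})
    for prop, wanted in target_props.items():
        actual = draft_props.get(prop)
        if _normalize(actual) != _normalize(wanted):
            fixes.append(
                {"node_id": element["id"], "prop": prop, "from": actual, "to": wanted}
            )
    return fixes


if __name__ == "__main__":
    element = {
        "id": "12:34",
        "props": {"fill": "#1A73E8", "cornerRadius": 4, "fontSize": 14},
    }
    target = {"fill": "#1a73e8", "cornerRadius": 8, "fontSize": 14}
    print(plan_property_fixes(element, target))
```

Because each fix names only a node and a property, it can be applied without knowing the library's internal component keys, which is what made this strategy workable.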
The third challenge was data access inside the pipeline. The review pipeline could not depend on live external scraping, and in practice, scraping design systems is highly system-specific anyway: some systems expose data through GitHub, some provide JSON directly, and others require importing libraries into Figma and querying them through Figma APIs. Because of that, we built a separate scraping service to collect and normalize design-system tokens and component metadata ahead of time, store them in Google Cloud Storage, and serve them through a Google Cloud Function API. This let the GitLab pipeline load pre-scraped design-system data quickly and reliably before each review run.
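As an illustration of the normalization step, the sketch below flattens a nested token file into dotted token names that a pipeline can look up directly. The input layout (nested groups whose leaves carry a `value` key) is an assumed example format, not the format of any particular design system, and `flatten_tokens` is a hypothetical helper.

```python
def flatten_tokens(tree, prefix=""):
    """Flatten nested token groups into {"group.name": value} pairs."""
    flat = {}
    for key, node in tree.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(node, dict) and "value" in node:
            # Leaf token: keep only its resolved value.
            flat[name] = node["value"]
        elif isinstance(node, dict):
            # Token group: recurse with the extended name.
            flat.update(flatten_tokens(node, name))
    return flat


if __name__ == "__main__":
    raw = {
        "color": {"primary": {"value": "#1a73e8"}},
        "radius": {"md": {"value": 8}},
    }
    print(flatten_tokens(raw))
```

Each source (GitHub, direct JSON, Figma APIs) would need its own adapter to reach a shared shape like this before the data is stored and served.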
Accomplishments that we're proud of
We are especially proud of pushing the CI/CD and automated pipeline paradigm into design review, a space that remains largely unexplored. While recent AI progress has accelerated design-to-code workflows, the reverse direction, bringing structured automation, semantic review, and system alignment back into design, is still full of open problems. PixelPipeline explores that gap by treating design review not as a loose comment thread, but as a pipeline-native process: a design draft can be submitted, reviewed, analyzed, and returned with actionable fixes through GitLab.
We are also proud that we did not stop at simple string matching or snapshot-based versioning. Instead, we used LLM reasoning to infer semantic intent from design elements and map them to the closest design-system meaning, even when the draft is messy, detached, or not formally structured. That allowed us to move beyond surface-level review and toward automatic alignment with design-system standards in a way that feels much closer to how designers actually work.
Another accomplishment was making the whole experience usable for designers, not just technically impressive. By packaging the workflow into a Figma plugin, we made it possible for designers to stay inside Figma, submit reviews, wait for the GitLab pipeline to run, and apply suggested fixes with minimal friction. In other words, we did not just build backend intelligence; we built a bridge that makes a code-style review paradigm feel natural in a design-native environment.
Finally, we are proud that the project works as a real end-to-end system across multiple layers: a Figma plugin for submission and fix application, a custom GitLab Duo Agent flow for semantic review and patch planning, and a cloud-backed design-system service for reusable token and component data. Bringing all of those parts together into one coherent workflow was a big accomplishment in itself.
What we learned
We learned that design review cannot simply copy code review. The CI/CD mindset is valuable, but design requires a different technical approach because UI drafts are not as explicitly structured as code.
We also learned that semantic understanding matters more than structural matching. For design automation to be useful, the system has to infer what an element is intended to be, not just what layers it is made of.
Another key lesson was that property-level alignment is more practical than full component replacement. In real design tools and libraries, direct component swapping is often hard to operationalize, while aligning properties like color, typography, spacing, and shape already delivers meaningful value.
We learned that UX matters just as much as the intelligence behind the system. If the workflow does not fit naturally into how designers already work, even strong automation will struggle to gain adoption.
Finally, we learned that design systems are one of the best entry points for AI automation in design. Because they define shared standards for visual and interaction patterns, they create the structure needed to make automated review and alignment genuinely useful.
What's next for PixelPipeline
Next, we want to expand PixelPipeline beyond a single design-system demo into a broader design QA platform. That means supporting more design systems and making onboarding easier, including systems that are less strictly tokenized and more guideline-driven, such as Apple HIG, where the challenge is not just matching fixed values but interpreting higher-level design principles.
We also want to grow from design-system alignment into a more complete design review pipeline. UX design is much broader than visual consistency alone. Questions like navigation flow, interaction clarity, and overall user experience could also benefit from LLM reasoning combined with GitLab Duo Agent workflows and a structured CI/CD-style review process.
Another next step is improving multimodal semantic inference. Today, much of the reasoning depends on structured signals such as color values, coordinates, hierarchy, and layout relationships. In the future, we want to bring rendered screenshots and visual outputs into the loop so LLMs can compare actual UI appearance against design-system references more reliably.
We also see strong potential in moving from simple suggestions toward human-in-the-loop automation. Our goal is not to remove designers from the process, but to help them adopt a more powerful way of managing design changes, one that brings the benefits of review pipelines, controlled iteration, and collaborative quality checks into design work itself.
Longer term, we want to connect this into an end-to-end design-to-review-to-code workflow, and help make GitLab a real home for design collaboration, not just engineering collaboration.
Built With
- ci/cd
- google-cloud
- google-cloud-function
- html/css
- javascript
- json
- python
- yaml