🚀 Blink AI: The Future of Autonomous UI Testing & Self-Healing

Inspiration

Software development moves at lightning speed, but UI testing is stuck in the slow lane. Every developer knows the pain: you submit a beautiful Merge Request (MR), but then you spend hours writing brittle Playwright scripts, waiting for review environments, and manually clicking through pages to ensure a button didn't break.

We asked ourselves: "What if the Merge Request could test itself?" What if an agent could look at code, understand the visual impact, spin up a cloud environment, and perform "human-like" visual verification? That is why we built Blink AI—to remove the friction of manual UI QA and let developers focus on building, not clicking.

What it does

Blink AI is an end-to-end autonomous testing ecosystem integrated directly into the GitLab workflow. It doesn't just "chat"; it takes action.

Analyzes MRs: It reads your code changes and summarizes the visual impact.
Generates Test Plans: It dynamically creates focused UI test scenarios based on the diff.
Orchestrates Infrastructure: It uses glab to trigger GCP-powered GitLab Runners, deploying a live Preview URL.
Visual Execution: It invokes an MCP Server to "drive" a browser, capturing GIFs of every action.
Self-Healing (The Fix Flow): If a test fails, a specialized multi-agent flow analyzes the failure, writes the code fix, and pushes a new commit to your branch automatically.

Feature	The Old Way	The Blink AI Way
Test Creation	Manual script writing (hours)	AI-Generated from code diff (seconds)
Execution	Brittle selectors, flaky CI	Vision-based "human-like" interaction
Evidence	Static logs and trace files	Full "Replay GIF" posted to MR notes
Bug Fixing	Manual debugging and re-patching	Autonomous multi-agent self-healing

How we built it

Blink AI combines models from Anthropic and Google with Google Cloud infrastructure and the GitLab Duo Agent Platform.

The Architecture

Tool Flow Diagram

GitLab Duo Custom Agent (The Brain): Built on the GitLab Agent Platform using Anthropic’s Claude Haiku 4.5. It orchestrates the entire workflow, manages the "Fix Flow" logic, and interacts with the developer via MR comments.
MCP Server (The Hands & Eyes): A Python-based Model Context Protocol server hosted on Google Cloud Run. It uses Gemini 3 Flash (Computer Use) and Playwright to visually navigate the app.
GCP Infrastructure (The Muscle):
- GCP GitLab Runners: Scalable executors that build and deploy the test targets.
- Google Cloud Storage (GCS): Stores visual evidence and test artifacts.
- Artifact Registry: Manages the containerized MCP and Demo applications.
The Ecommerce Demo: A React + Vite application that serves as our real-world test target.

Challenges we ran into

The 64K Token Barrier: Early on, sending high-res screenshots as Base64 strings hit the model's limit in seconds. We solved this by implementing a GCS-backed "Stateless" architecture, sending URI references instead of raw data.
Cryptographic Security: Vertex AI’s strict "Thought Signature" checks meant we couldn't just prune conversation history. We had to design a stateless loop that treats every step as a fresh "Turn 1" while maintaining an internal action log.
Synchronous Timeouts: GitLab tools have a 60s timeout, but a full UI test run takes minutes. We moved our agent to an Iterative Loop pattern that calls the MCP tool for one test case at a time, so each call finishes within the timeout and a single failed call does not sink the whole run.
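The three workarounds above can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not our production code: `StepState`, `build_turn`, `run_test_cases`, the request dictionary layout, and the `gs://` path in the usage note are all hypothetical names chosen for the example. It shows a request that is rebuilt from scratch on every step (a text action log plus a URI reference to the latest screenshot, never Base64 bytes) and a loop that invokes the tool once per test case.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StepState:
    """Everything needed to rebuild a fresh 'Turn 1' request for the next step."""
    goal: str
    action_log: list[str] = field(default_factory=list)


def build_turn(state: StepState, screenshot_uri: str) -> dict:
    """Build a single-turn request (hypothetical layout).

    No earlier model messages are replayed. The model only sees a plain-text
    log of past actions and a URI pointing at the latest screenshot.
    """
    history = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(state.action_log))
    prompt = f"Goal: {state.goal}\nActions so far:\n{history or '(none)'}"
    return {
        "text": prompt,
        "image": {"file_uri": screenshot_uri, "mime_type": "image/png"},
    }


def run_test_cases(test_cases: list[str], run_one: Callable[[str], dict]) -> list[dict]:
    """Call the tool once per test case instead of once for the whole plan,
    so every individual call stays short."""
    results = []
    for case in test_cases:
        results.append(run_one(case))
    return results
```

For example, `build_turn(StepState("Apply promo code", ["click cart"]), "gs://example-bucket/step-1.png")` yields a request whose size does not grow with the number of screenshots taken so far, only with the length of the text log.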

Accomplishments that we're proud of

Zero-Config UI Testing: We successfully demonstrated an agent that can test a "Promo Code" feature from scratch without a single line of pre-written test code.
Visual Evidence: The agent automatically generates an execution_replay.gif for every test, making it incredibly easy for judges and developers to see the AI's logic.
The "Fix Flow": We created a multi-agent state machine (Analyser -> Patcher -> Validator) that writes and commits code to fix the bugs it finds.
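The Analyser -> Patcher -> Validator sequence can be pictured as a small state machine. The sketch below is illustrative only: the `Stage` enum, `run_fix_flow`, the three callables, and the retry policy (return to analysis when validation rejects a patch, give up after `max_attempts` patches) are assumptions made for the example, not a description of the actual GitLab flow definition.

```python
from enum import Enum
from typing import Callable


class Stage(Enum):
    ANALYSE = "analyse"
    PATCH = "patch"
    VALIDATE = "validate"
    DONE = "done"
    FAILED = "failed"


def run_fix_flow(
    failure: str,
    analyse: Callable[[str], str],    # failure report -> diagnosis
    patch: Callable[[str], str],      # diagnosis -> candidate patch
    validate: Callable[[str], bool],  # candidate patch -> accepted?
    max_attempts: int = 3,
) -> tuple[Stage, list[Stage]]:
    """Drive the three agents until a patch validates or attempts run out.

    Returns the final stage and the ordered list of stages visited.
    """
    stage = Stage.ANALYSE
    attempts = 0
    diagnosis = ""
    candidate = ""
    trace: list[Stage] = []
    while stage not in (Stage.DONE, Stage.FAILED):
        trace.append(stage)
        if stage is Stage.ANALYSE:
            diagnosis = analyse(failure)
            stage = Stage.PATCH
        elif stage is Stage.PATCH:
            candidate = patch(diagnosis)
            attempts += 1
            stage = Stage.VALIDATE
        else:  # Stage.VALIDATE
            if validate(candidate):
                stage = Stage.DONE
            elif attempts >= max_attempts:
                stage = Stage.FAILED
            else:
                # Feed the rejected patch back so the next analysis can use it.
                failure = f"{failure}\nrejected patch: {candidate}"
                stage = Stage.ANALYSE
    return stage, trace
```

Keeping the agents as plain callables makes the control flow testable without any model calls: stub functions stand in for the Analyser, Patcher, and Validator.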

What we learned

The Power of MCP: The Model Context Protocol gave us a standard tool interface between the GitLab agent and an external service. It allowed us to securely bridge GitLab's environment with Google's vision models.
Vision over Selectors: We learned that AI agents are much more effective at testing when they "see" the screen like a human, rather than trying to find a hidden div ID in the DOM.
Enterprise Security: We implemented custom Bearer Token middleware to ensure that only authorized GitLab agents can trigger our high-performance Cloud Run executors.
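A Bearer Token check of the kind described under Enterprise Security can be sketched as a small WSGI middleware that uses only the Python standard library. This is an assumption-laden illustration rather than our deployed middleware: the class name `BearerAuthMiddleware`, the helper `extract_bearer`, and the single shared token are hypothetical, and the real server may use a different web framework.

```python
import hmac
from typing import Optional


def extract_bearer(header_value: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, or None."""
    if not header_value:
        return None
    scheme, _, token = header_value.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


class BearerAuthMiddleware:
    """Reject any request that does not carry the expected bearer token."""

    def __init__(self, app, expected_token: str):
        self.app = app
        self.expected = expected_token.encode("utf-8")

    def __call__(self, environ, start_response):
        token = extract_bearer(environ.get("HTTP_AUTHORIZATION"))
        # compare_digest avoids leaking how many leading bytes matched.
        if token is None or not hmac.compare_digest(token.encode("utf-8"), self.expected):
            start_response(
                "401 Unauthorized",
                [("Content-Type", "text/plain"), ("WWW-Authenticate", "Bearer")],
            )
            return [b"unauthorized"]
        return self.app(environ, start_response)
```

In a deployment like ours the expected token would come from a secret store or environment variable rather than from source code, and the wrapped `app` would be the MCP server's request handler.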

What's next for Blink AI

Green Testing Integration: Optimizing GCP Runner usage to spin down environments the second tests finish, reducing carbon footprint.
Multi-Browser Support: Expanding the MCP server to run Safari, Firefox, and Chrome simultaneously for cross-browser visual regression.
Compliance Auto-Generation: Automatically generating accessibility (a11y) and compliance reports based on the visual state of the MR.

Custom Agent: https://gitlab.com/gitlab-ai-hackathon/participants/33413865/-/automate/agents/1007260/

Custom Flow: https://gitlab.com/gitlab-ai-hackathon/participants/33413865/-/automate/flows

Blink AI: Don't just merge. Blink, and it's tested.