gemini-pipeline-agent

Inspiration

Your CI dashboard glows red. You click the pipeline. You scroll past the green stages. You click the failed job. You scroll past 1200 lines of successful test output. Somewhere on line 1247 is the assertion message that tells you what actually broke. That last step, finding the signal in a long log, is the boring part of every code review I've done in the last five years. The agent skips straight to it.

What it does

gemini-pipeline-agent treats every red pipeline as a triage job. The agent walks the GitLab MCP tools in order:

1. list_projects to resolve the project name to an ID
2. list_pipelines with status=failed to find recent failures
3. list_pipeline_jobs on the most recent failed pipeline to identify which job failed
4. get_job_log for the failed job to read the captured stdout/stderr

The agent then quotes the assertion message verbatim and proposes the next step.
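The four-step walk above can be sketched in plain Python against in-memory stand-ins for the tools. This is only an illustration of the call order, not the project's code: the canned data, the parameter names (search, project_id, pipeline_id, job_id), the numeric IDs, and the rule "highest pipeline ID is the most recent" are all assumptions made for the example.

```python
from typing import Any, Dict, List, Optional

# Hypothetical canned data standing in for a GitLab tenant.
PROJECTS = [{"id": 101, "path": "acme/checkout-api"}]
PIPELINES = {101: [{"id": 331, "status": "success"},
                   {"id": 332, "status": "failed"}]}
JOBS = {332: [{"id": 9001, "name": "unit test", "status": "success"},
              {"id": 9002, "name": "integration test", "status": "failed"}]}
LOGS = {9002: "PASS unit\nFAIL integration\nReceived number of calls: 3\n"}


# Stand-ins for the four tools; the signatures are assumed, not the real MCP schema.
def list_projects(search: str) -> List[Dict[str, Any]]:
    return [p for p in PROJECTS if search in p["path"]]


def list_pipelines(project_id: int, status: Optional[str] = None) -> List[Dict[str, Any]]:
    pipelines = PIPELINES.get(project_id, [])
    return [p for p in pipelines if status is None or p["status"] == status]


def list_pipeline_jobs(project_id: int, pipeline_id: int) -> List[Dict[str, Any]]:
    return JOBS.get(pipeline_id, [])


def get_job_log(project_id: int, job_id: int) -> str:
    return LOGS[job_id]


def triage(project_path: str) -> Dict[str, Any]:
    """Walk the four tools in order and return the failed job's log."""
    # Step 1: resolve the project name to an ID.
    project = next(p for p in list_projects(search=project_path)
                   if p["path"] == project_path)
    # Step 2: find failed pipelines; assume the highest ID is the most recent.
    failed = list_pipelines(project["id"], status="failed")
    pipeline = max(failed, key=lambda p: p["id"])
    # Step 3: identify which job in that pipeline failed.
    jobs = list_pipeline_jobs(project["id"], pipeline["id"])
    job = next(j for j in jobs if j["status"] == "failed")
    # Step 4: read the captured log for that job.
    log = get_job_log(project["id"], job["id"])
    return {"project_id": project["id"], "pipeline_id": pipeline["id"],
            "job": job["name"], "log": log}


if __name__ == "__main__":
    print(triage("acme/checkout-api"))
```

In the real agent the model decides each call from the system prompt; the fixed function here only makes the ordering and the data handed from step to step explicit.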

Output shape:

PROJECT: acme/checkout-api
PIPELINE: 332 feat/whatsapp-no-stream 9f2a1b... failed 612s
FAILED JOB: integration test 274s
ROOT CAUSE: <one sentence drawn from the log>
EVIDENCE: <2-4 bullets quoting specific log lines verbatim>
NEXT STEP: <one concrete fix to try>
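A small renderer makes the shape above concrete. This is a sketch under assumptions: the TriageReport class and its field names are hypothetical, the evidence bullets are printed one per line under the EVIDENCE label, and the 2-4 bullet rule is enforced in code even though the project may enforce it only through the prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TriageReport:
    # Field names are illustrative; only the rendered labels come from the write-up.
    project: str
    pipeline_id: int
    ref: str
    sha: str
    status: str
    pipeline_seconds: int
    job_name: str
    job_seconds: int
    root_cause: str
    evidence: List[str]
    next_step: str


def render(report: TriageReport) -> str:
    """Render a report using the six labels of the output shape."""
    if not 2 <= len(report.evidence) <= 4:
        raise ValueError("EVIDENCE expects 2-4 quoted log lines")
    lines = [
        f"PROJECT: {report.project}",
        f"PIPELINE: {report.pipeline_id} {report.ref} {report.sha} "
        f"{report.status} {report.pipeline_seconds}s",
        f"FAILED JOB: {report.job_name} {report.job_seconds}s",
        f"ROOT CAUSE: {report.root_cause}",
        "EVIDENCE:",
        *[f"  - {quote}" for quote in report.evidence],
        f"NEXT STEP: {report.next_step}",
    ]
    return "\n".join(lines)


# Placeholder values for the free-text fields; the header values mirror the sample above.
EXAMPLE = TriageReport(
    project="acme/checkout-api", pipeline_id=332, ref="feat/whatsapp-no-stream",
    sha="9f2a1b...", status="failed", pipeline_seconds=612,
    job_name="integration test", job_seconds=274,
    root_cause="<one sentence drawn from the log>",
    evidence=["<quoted log line 1>", "<quoted log line 2>"],
    next_step="<one concrete fix to try>",
)

if __name__ == "__main__":
    print(render(EXAMPLE))
```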

How we built it

Agent Development Kit (ADK), from Google Cloud Agent Builder, for the agent framework. The whole agent fits in six lines of ADK: one LlmAgent, one McpToolset, a Gemini model, and a system prompt that defines the four-step workflow.
Gemini 2.5 Flash on Vertex AI for reasoning.
GitLab MCP server for tools. The agent calls the tool surface of the published GitLab MCP server (list_projects, get_project, list_pipelines, get_pipeline, list_pipeline_jobs, get_job_log). The repo also ships a stub MCP server with realistic canned projects, pipelines, and Jest-style job logs, so the demo runs without a GitLab tenant.
Streamlit for the dashboard.
Cloud Run for hosting.

Challenges we ran into

The first version of the system prompt let the model paraphrase the log instead of quoting it. The fix was an explicit "quote log lines verbatim, including assertion messages" instruction. Subsequent live runs reproduced the assertion message at router.integration.test.ts:42:38 exactly as it appears in the canned log, with no paraphrase or rewording.
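The prompt instruction was the actual fix. One way to guard it against regression, which is an idea for illustration rather than something the write-up says exists, is to check that every quoted evidence line occurs character for character in the job log. The function name and the sample log below are hypothetical; the log imitates Jest's call-count failure output and is not the repo's canned log.

```python
from typing import List

# Hypothetical Jest-style excerpt used only for this example.
SAMPLE_LOG = """FAIL test/channels/router.integration.test.ts
    expect(jest.fn()).toHaveBeenCalledTimes(expected)

    Expected number of calls: 1
    Received number of calls: 3

      at test/channels/router.integration.test.ts:42:38
"""


def non_verbatim_quotes(quotes: List[str], log_text: str) -> List[str]:
    """Return the quotes that do not appear exactly in the log text."""
    missing = []
    for quote in quotes:
        needle = quote.strip()
        # An empty quote would trivially match, so treat it as a failure too.
        if not needle or needle not in log_text:
            missing.append(quote)
    return missing


if __name__ == "__main__":
    quotes = ["Received number of calls: 3",
              "sendChunk was called three times"]  # second one is a paraphrase
    print(non_verbatim_quotes(quotes, SAMPLE_LOG))
```

A substring test is deliberately strict: any rewording, even a changed number format, shows up as a miss.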

Accomplishments that we're proud of

A real Vertex AI Gemini call walked all four GitLab tools in 9 events and identified that sendChunk was called 3 times instead of 1 in test/channels/router.integration.test.ts, citing the exact line and file path.
Ten passing tests cover the stub server's tool responses and the agent wiring.
This is the sixth substantively different MCP integration in this family of sibling hackathon projects (Dynatrace, Arize Phoenix, MongoDB, RAG drift, Elastic, GitLab). All six share the same LlmAgent + McpToolset shape; the MCP protocol carried the abstraction every time.

What we learned

When the agent's job is to surface a quote, not a summary, "verbatim" needs to be in the system prompt explicitly. The model defaults to paraphrasing otherwise, which hides the most useful artifact (the exact error text).

What's next for gemini-pipeline-agent

A "compare with last green pipeline" tool that diffs the failed job's log against the previous successful run's log to surface only the new errors.
Auto-tag the merge request with the diagnosed failure class (test-flake / build-failure / lint / deployment).
Plug in the official Dynatrace MCP for cross-correlation: failed pipeline + spiking error rate in prod = a coordinated incident view.
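The first item above, comparing against the last green pipeline, could start from a plain line diff. The following is a sketch of that idea using Python's standard difflib, not a committed design: the function name and both sample logs are made up, and real logs would need timestamps and durations normalized first, since those differ between runs even when nothing broke.

```python
import difflib
from typing import List


def lines_new_in_failed(green_log: str, failed_log: str) -> List[str]:
    """Return lines that the diff marks as present only in the failed log."""
    diff = difflib.ndiff(green_log.splitlines(), failed_log.splitlines())
    # ndiff prefixes lines unique to the second input with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical logs for illustration.
GREEN = "PASS unit\nPASS integration\nDone"
FAILED = "PASS unit\nFAIL integration\nReceived number of calls: 3\nDone"

if __name__ == "__main__":
    for line in lines_new_in_failed(GREEN, FAILED):
        print(line)
```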

Built With

agent-development-kit
gemini
gemini-2.5
gitlab
gitlab-mcp
google-cloud-agent-builder
google-cloud-run
mcp
python
streamlit
vertex-ai

Updates

Mukunda Katta started this project — May 18, 2026 11:26 AM EDT
