Inspiration
Your CI dashboard glows red. You click the pipeline. You scroll past the green stages. You click the failed job. You scroll past 1200 lines of successful test output. Somewhere on line 1247 is the assertion message that tells you what actually broke. That last step — finding the actual signal in a long log — is the boring part of every code review I've done in the last five years. The agent skips straight to it.
What it does
gemini-pipeline-agent treats every red pipeline as a triage job. The agent walks the GitLab MCP tools in order:
- list_projects to resolve the project name to an ID
- list_pipelines with status=failed to find recent failures
- list_pipeline_jobs on the most recent failed pipeline to identify which job failed
- get_job_log for the failed job to read the captured stdout/stderr
The agent then quotes the assertion message verbatim and proposes the next step.
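The four-step walk can be sketched as plain Python. This is only an illustration under stated assumptions: in the real agent the model chooses the tool calls, whereas here they are hard-coded, and the four functions are hypothetical in-memory stand-ins for the MCP tools. Their parameters, the response field names (id, path, status, name), the newest-first ordering, and the placeholder log text are all assumptions, not the GitLab MCP server's actual schema.

```python
from typing import Any, Dict, List

# Hypothetical canned responses standing in for the GitLab MCP tools.
# Field names and values are assumptions made for this sketch.
def list_projects(search: str) -> List[Dict[str, Any]]:
    return [{"id": 7, "path": "acme/checkout-api"}]

def list_pipelines(project_id: int, status: str) -> List[Dict[str, Any]]:
    # Assumed to be returned newest first.
    return [{"id": 332, "status": "failed"}, {"id": 329, "status": "failed"}]

def list_pipeline_jobs(project_id: int, pipeline_id: int) -> List[Dict[str, Any]]:
    return [
        {"id": 9001, "name": "lint", "status": "success"},
        {"id": 9002, "name": "integration test", "status": "failed"},
    ]

def get_job_log(project_id: int, job_id: int) -> str:
    # Placeholder log text, not a real job log.
    return "step 1 ok\nstep 2 ok\nAssertionError: placeholder assertion message\n"

def triage(project_name: str) -> Dict[str, Any]:
    """Walk the four tools in order and return the raw material for a report."""
    # 1. Resolve the project name to an ID.
    project = next(p for p in list_projects(project_name) if p["path"] == project_name)
    # 2. Find recent failed pipelines and take the most recent one.
    pipelines = list_pipelines(project["id"], status="failed")
    if not pipelines:
        raise LookupError(f"no failed pipelines for {project_name}")
    latest = pipelines[0]
    # 3. Identify which job failed in that pipeline.
    jobs = list_pipeline_jobs(project["id"], latest["id"])
    failed_job = next(j for j in jobs if j["status"] == "failed")
    # 4. Read the captured stdout/stderr of the failed job.
    log = get_job_log(project["id"], failed_job["id"])
    return {
        "project": project_name,
        "pipeline_id": latest["id"],
        "failed_job": failed_job["name"],
        "log": log,
    }
```

Calling triage("acme/checkout-api") returns the pipeline ID, the failed job name, and the log text that the model would then quote from.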
Output shape:
PROJECT: acme/checkout-api
PIPELINE: 332 feat/whatsapp-no-stream 9f2a1b... failed 612s
FAILED JOB: integration test 274s
ROOT CAUSE: <one sentence drawn from the log>
EVIDENCE: <2-4 bullets quoting specific log lines verbatim>
NEXT STEP: <one concrete fix to try>
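Because the output shape is fixed, it can be checked mechanically. The following is a minimal sketch of such a check, not part of the project as described: parse_report and LABELS are hypothetical names, and the sketch assumes each label starts its own line and that no quoted evidence line begins with one of the labels.

```python
import re
from typing import Dict

# The six section labels, in the order the report is expected to use them.
LABELS = ["PROJECT", "PIPELINE", "FAILED JOB", "ROOT CAUSE", "EVIDENCE", "NEXT STEP"]

_LABEL_RE = re.compile(
    r"^(%s):[ \t]*" % "|".join(re.escape(label) for label in LABELS),
    re.MULTILINE,
)

def parse_report(text: str) -> Dict[str, str]:
    """Split a report into labelled sections.

    Raises ValueError if a label is missing, repeated, or out of order.
    """
    matches = list(_LABEL_RE.finditer(text))
    found = [m.group(1) for m in matches]
    if found != LABELS:
        raise ValueError(f"expected sections {LABELS}, got {found}")
    sections: Dict[str, str] = {}
    for i, match in enumerate(matches):
        # A section runs until the next label, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[match.group(1)] = text[match.end():end].strip()
    return sections
```

A check like this could run in the test suite against the agent's final message, so that a missing or reordered section fails loudly instead of silently reaching the dashboard.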
How we built it
- Google Cloud Agent Builder (ADK) for the agent framework. The whole agent fits in six lines of ADK: one LlmAgent, one McpToolset, a Gemini model, and a system prompt that defines the four-step workflow.
- Gemini 2.5 Flash on Vertex AI for reasoning.
- GitLab MCP server for tools. The agent talks to the published GitLab MCP server's tool surface (list_projects, get_project, list_pipelines, get_pipeline, list_pipeline_jobs, get_job_log). A stub MCP server ships in the repo with canned realistic projects, pipelines, and Jest-style job logs so the demo runs without a GitLab tenant.
- Streamlit for the dashboard.
- Cloud Run for hosting.
Challenges we ran into
The first version of the system prompt let the model paraphrase the log instead of quoting it. The fix was an explicit "quote log lines verbatim, including assertion messages" instruction. Subsequent live runs reproduced the assertion message at router.integration.test.ts:42:38 exactly as it appears in the canned log — no paraphrase, no rewording.
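A prompt instruction can be backed by a cheap programmatic check that every evidence quote really occurs in the log. The sketch below is one way that could look, assuming the evidence bullets have already been extracted into a list of strings. The function name unquoted_evidence is hypothetical and not taken from the repo.

```python
from typing import List

def unquoted_evidence(evidence: List[str], log: str) -> List[str]:
    """Return the evidence quotes that do not appear verbatim within a single log line."""
    log_lines = log.splitlines()
    missing = []
    for quote in evidence:
        needle = quote.strip()
        # An empty quote proves nothing, so treat it as missing too.
        if not needle or not any(needle in line for line in log_lines):
            missing.append(quote)
    return missing
```

An empty return value means every quote was found as written. Anything else is a paraphrase (or an invention) that should fail the run.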
Accomplishments that we're proud of
- A real Vertex AI Gemini call walked all four GitLab tools in 9 events and identified sendChunk was called 3 times instead of 1 in test/channels/router.integration.test.ts, citing the exact line and file path.
- 10 passing tests cover the stub server's tool responses and the agent wiring.
- This is the sixth substantively different MCP integration in this hackathon sibling family (Dynatrace, Arize Phoenix, MongoDB, RAG drift, Elastic, GitLab). All six share the same LlmAgent + McpToolset shape; the MCP protocol carried the abstraction every time.
What we learned
When the agent's job is to surface a quote, not a summary, "verbatim" needs to be in the system prompt explicitly. The model defaults to paraphrasing otherwise, which hides the most useful artifact (the exact error text).
What's next for gemini-pipeline-agent
- A "compare with last green pipeline" tool that diffs the failed job's log against the previous successful run's log to surface only the new errors.
- Auto-tag the merge request with the diagnosed failure class (test-flake / build-failure / lint / deployment).
- Plug in the official Dynatrace MCP for cross-correlation: failed pipeline + spiking error rate in prod = a coordinated incident view.
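The "compare with last green pipeline" idea could start as small as the sketch below. It is an assumption about how such a tool might work, not existing code: new_lines is a hypothetical name, and the comparison is a plain set difference on whitespace-trimmed lines, so logs with timestamps or durations on every line would need normalising first.

```python
from typing import List

def new_lines(failed_log: str, green_log: str) -> List[str]:
    """Lines of the failed log that never occur in the green log.

    Order is preserved and repeated lines are reported once.
    """
    seen = {line.strip() for line in green_log.splitlines()}
    fresh = []
    for line in failed_log.splitlines():
        key = line.strip()
        if key and key not in seen:
            fresh.append(line)
            seen.add(key)  # suppress later repeats of the same line
    return fresh
```

Only the lines that are new in the failed run survive, which is the part of the log the agent would need to quote from.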
