Inspiration

Database migrations are among the most dangerous operations in software development: a single DROP COLUMN or data type change can silently break production, cause data loss, and take hours to recover from. Yet most teams rely on manual code review to catch these issues, which is inconsistent, and problems are easy to miss under deadline pressure.

We wanted to build something that catches these risks automatically, before they ever reach production, with zero extra effort from the developer.


What it does

DB Migration Safety Checker is a GitLab Duo Flow that automatically:

  • Detects SQL migration files in every newly opened merge request
  • Classifies each schema change as SAFE, RISKY, or DANGEROUS
  • Blocks dangerous MRs from merging and adds a do-not-merge label
  • Creates a blocking issue with a safe migration plan and rollback guidance
  • Logs all changes to BigQuery for historical tracking
  • Generates team insights using Gemini based on the last 30 days of migration history, surfacing patterns like repeated destructive changes on critical tables
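
To make the three risk tiers concrete, here is a minimal rule-based sketch in Python. It is only an illustration: in the project the classification is done by the GitLab Duo Flow, and the patterns and the function names `classify_statement` and `classify_migration` are assumptions chosen for the example, not the flow's actual rules.

```python
import re

# Hypothetical rules for illustration only; the real flow classifies with an LLM.
# Rules are checked in order, so the most severe matching tier wins.
RULES = [
    ("DANGEROUS", re.compile(r"\bDROP\s+(TABLE|COLUMN)\b|\bTRUNCATE\b", re.IGNORECASE)),
    ("RISKY", re.compile(r"\bALTER\s+COLUMN\b|\bRENAME\b|\bSET\s+NOT\s+NULL\b", re.IGNORECASE)),
]

RISK_ORDER = {"SAFE": 0, "RISKY": 1, "DANGEROUS": 2}


def classify_statement(sql: str) -> str:
    """Classify a single SQL statement as SAFE, RISKY, or DANGEROUS."""
    for level, pattern in RULES:
        if pattern.search(sql):
            return level
    return "SAFE"


def classify_migration(sql_text: str) -> str:
    """Return the highest risk level across all statements in a file.

    Splitting on ';' is naive (it ignores semicolons inside string
    literals), which is acceptable for a sketch but not for production.
    """
    statements = [s.strip() for s in sql_text.split(";") if s.strip()]
    levels = [classify_statement(s) for s in statements]
    return max(levels, key=RISK_ORDER.__getitem__, default="SAFE")
```

A file is rated by its worst statement, so one DROP COLUMN among several harmless changes still marks the whole migration DANGEROUS.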

How we built it

The system has two main components:

1. GitLab Duo Flow: A flow defined in YAML that triggers on a mention in an MR. It reads the diff, identifies schema changes, classifies risk, labels the MR, creates blocking issues, and posts a detailed safety report, all automatically.

2. Cloud Run Webhook + Poller: A Flask service deployed on Google Cloud Run that:

  • Polls GitLab every 5 minutes via Cloud Scheduler
  • Auto-triggers the flow on new Merge Requests with SQL changes
  • Parses the flow's safety report using Gemini
  • Logs all changes to BigQuery
  • Calls Gemini (gemini-2.5-flash via Vertex AI) to generate team-level insights from migration history
  • Posts the insight as a follow-up comment on the MR
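
The detection step of the poller can be sketched as follows, under stated assumptions: each MR is a dict with an `iid` and a `changes` list whose entries carry a `new_path`, modelled loosely on the shape of GitLab's merge request changes response. The `.sql` suffix check and both function names are hypothetical, and fetching from GitLab and triggering the flow are left out.

```python
from typing import Iterable


def has_sql_changes(changes: Iterable[dict]) -> bool:
    """True if any changed file looks like a SQL migration (by extension)."""
    return any(c.get("new_path", "").lower().endswith(".sql") for c in changes)


def select_new_sql_mrs(merge_requests: list[dict], seen_iids: set[int]) -> list[int]:
    """Return iids of MRs that touch SQL files and were not handled before.

    Selected iids are added to seen_iids, so polling the same MR list
    again does not trigger the flow twice. MRs without SQL changes are
    not marked as seen, because a later commit may add a migration.
    """
    selected = []
    for mr in merge_requests:
        iid = mr["iid"]
        if iid in seen_iids:
            continue
        if has_sql_changes(mr.get("changes", [])):
            selected.append(iid)
            seen_iids.add(iid)
    return selected
```

An in-memory set is enough to show the idea, but it would not survive a restart or be shared between instances, so a real deployment would likely keep this state somewhere persistent.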

Tech Stack: GitLab Duo Flow, Google Cloud Run, Cloud Scheduler, BigQuery, Vertex AI (Gemini 2.5 Flash), Flask, Python


Challenges we ran into

  • Webhook limitations (GitLab project restrictions) — We initially planned to use GitLab webhooks to trigger the flow in real time. However, org-level restrictions on the hackathon project prevented configuring webhooks. To work around this, we used Cloud Scheduler + Cloud Run polling, which reliably detects new MRs and triggers the flow.
  • Model availability — The initially targeted Gemini model (gemini-2.0-flash-001) was unavailable for new GCP projects as of March 2026. We had to identify gemini-2.5-flash as the correct model to use.
  • Org policy restrictions — Our GCP project had org-level policies blocking API key creation, which required switching to Vertex AI service account authentication instead.
  • Worker timeouts — Running a background polling thread inside Gunicorn on Cloud Run caused repeated worker timeouts. We solved this by switching to Cloud Scheduler triggering a /poll endpoint instead of relying on background threads.
  • Inconsistent report parsing — The GitLab Duo Flow generates safety reports in slightly different formats each time due to LLM non-determinism. We replaced brittle regex parsing with a Gemini-powered parser that correctly extracts structured data from any report format.
  • Cloud Run scaling — Cloud Run scales to zero between requests, killing background threads. Setting min-instances 1 and using Cloud Scheduler solved this reliably.
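
Whichever model does the extraction, its reply still benefits from a check before anything is logged. The sketch below assumes the parser is prompted to return a JSON array of objects with `table`, `operation`, and `risk` fields (hypothetical field names) and shows one way to pull that array out of a reply and validate it. The Gemini call itself is omitted.

```python
import json
import re

ALLOWED_RISKS = {"SAFE", "RISKY", "DANGEROUS"}
REQUIRED_FIELDS = ("table", "operation", "risk")  # hypothetical schema


def extract_changes(reply: str) -> list[dict]:
    """Extract and validate the JSON array embedded in a model reply.

    Raises ValueError if no array is present or the JSON is malformed
    (json.JSONDecodeError is a subclass of ValueError). Entries that
    are incomplete or carry an unknown risk level are dropped.
    """
    # Models often wrap JSON in prose or a fence; take the outermost array.
    match = re.search(r"\[.*\]", reply, re.DOTALL)
    if match is None:
        raise ValueError("no JSON array found in model reply")
    items = json.loads(match.group(0))

    valid = []
    for item in items:
        if not isinstance(item, dict):
            continue
        if any(field not in item for field in REQUIRED_FIELDS):
            continue
        risk = str(item["risk"]).upper()
        if risk not in ALLOWED_RISKS:
            continue
        valid.append({**item, "risk": risk})
    return valid
```

Dropping invalid entries keeps bad rows out of the history table. Raising on a missing array gives the caller a clear point to retry the parsing call.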

Accomplishments that we're proud of

  • Built a fully automated end-to-end pipeline — from MR creation to safety report to team insight — with zero manual intervention
  • Used Gemini not just for insights but also as an intelligent parser, making the system resilient to LLM output variability
  • The team insight feature goes beyond just checking one migration — it detects long-term patterns across the team's entire migration history
  • The blocking issue generated by the flow includes a safe migration plan and rollback SQL, making it actionable rather than just a warning
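
The kind of pattern the insight step surfaces can be illustrated without BigQuery or Gemini. This sketch assumes each logged change is a dict with `table`, `risk`, and a timezone-aware `logged_at` datetime (hypothetical field names) and counts DANGEROUS changes per table over a 30-day window. In the project, the history lives in BigQuery and the summary text is written by Gemini.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def repeated_dangerous_tables(rows, now, window_days=30, min_count=2):
    """Map table name -> number of DANGEROUS changes inside the window.

    Only tables with at least min_count such changes are returned.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(
        row["table"]
        for row in rows
        if row["risk"] == "DANGEROUS" and row["logged_at"] >= cutoff
    )
    return {table: n for table, n in counts.items() if n >= min_count}


# Example usage with made-up history rows.
example_now = datetime(2026, 3, 31, tzinfo=timezone.utc)
example_rows = [
    {"table": "users", "risk": "DANGEROUS", "logged_at": example_now - timedelta(days=1)},
    {"table": "users", "risk": "DANGEROUS", "logged_at": example_now - timedelta(days=10)},
    {"table": "users", "risk": "DANGEROUS", "logged_at": example_now - timedelta(days=45)},
    {"table": "orders", "risk": "DANGEROUS", "logged_at": example_now - timedelta(days=2)},
]
example_result = repeated_dangerous_tables(example_rows, example_now)
```

A result like this could be passed to the model as context, so the generated insight is grounded in counts rather than in raw rows.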

What we learned

  • Cloud Run is not suitable for long-running background threads. Cloud Scheduler with HTTP triggers is the correct pattern for scheduled tasks on serverless infrastructure
  • LLM outputs should never be parsed with rigid regex; using another LLM call to extract structured data is far more robust
  • GitLab Duo Flows are powerful for automating MR workflows but require careful prompt engineering to produce consistent output formats
  • Vertex AI model availability varies by project age and region; always verify model access early in development

What's next for DB Migration Safety Checker

  • Webhook integration — Replace polling with a native GitLab webhook for instant triggering instead of waiting up to 5 minutes
  • Multi-project support — Track migration patterns across multiple repositories in the same organization
  • Slack/email alerts — Notify the team when dangerous migration patterns are detected
  • Dashboard — A BigQuery-powered dashboard showing migration risk trends over time
  • Auto-fix suggestions — Have Gemini not just identify dangerous migrations but automatically generate and commit safe alternatives as a new file in the MR

Built With

  • bigquery
  • flask
  • gitlab
  • gitlab-duo-agent-platform
  • google-cloud-run
  • google-cloud-scheduler
  • python
  • vertex-ai-(gemini-2.5-flash)