Fivetran Challenge - Short Submission Content
Inspiration
Databricks Unity Catalog and Google Cloud don't connect natively. Organizations can't use GCP AI for governance. We built the bridge: custom Fivetran connector + BigQuery + Gemini AI agents that automate documentation, compliance, and search - saving millions.
What it does
Custom Fivetran connector syncs Unity Catalog metadata to BigQuery. Four AI agents powered by Vertex AI Gemini provide natural language search, auto-documentation in about 1.5 seconds, and data quality monitoring. Cuts the time spent on manual governance.
How we built it
Built a Python connector with the Fivetran SDK → Unity Catalog REST API → Fivetran pipeline → BigQuery pipeline → four specialized Gemini 2.5 Flash AI agents → Streamlit dashboard. Implemented incremental sync, prompt engineering, caching, and containerized deployment. Optimized for speed (<1s) and cost (80% reduction).
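The incremental sync step can be illustrated with a minimal sketch. It is not the project's connector code and does not call the Fivetran SDK; it only shows the cursor idea in plain Python. The function name `incremental_sync`, the `cursor` state key, and the `updated_at` field on each record are assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple


def incremental_sync(
    rows: Iterable[Dict[str, Any]],
    state: Dict[str, Any],
    cursor_field: str = "updated_at",  # assumed timestamp field on each record
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """Return rows newer than the saved cursor, plus the advanced state."""
    last_seen = state.get("cursor", 0)
    # Keep only records changed since the previous run.
    fresh = [row for row in rows if row[cursor_field] > last_seen]
    # Advance the cursor to the newest record seen; keep it if nothing changed.
    new_cursor = max((row[cursor_field] for row in fresh), default=last_seen)
    return fresh, {**state, "cursor": new_cursor}


# Hypothetical metadata records standing in for Unity Catalog tables.
tables = [
    {"full_name": "main.sales.orders", "updated_at": 100},
    {"full_name": "main.sales.customers", "updated_at": 250},
]

first_batch, state = incremental_sync(tables, {})      # first run: everything
second_batch, state = incremental_sync(tables, state)  # second run: nothing new
```

Persisting the returned state between runs is what lets a later sync skip unchanged metadata instead of reloading the whole catalog.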
Challenges we ran into
API rate limits (solved: batching + backoff), BigQuery schema design (denormalized), Gemini consistency (prompt engineering + temperature tuning), Fivetran state management (cursor pagination), cost optimization (Gemini Flash + caching), secure deployment (Docker + env vars).
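The "batching + backoff" approach to rate limits can be sketched as follows. This is a hedged illustration, not the code used in the project: `RateLimitError`, `batched`, and `with_backoff` are hypothetical names, and the retry counts and delays are placeholder values.

```python
import random
import time
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised when an API call is rate limited (e.g. HTTP 429)."""


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Split items into consecutive chunks of at most `size` elements."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,  # injectable for testing
) -> T:
    """Retry `call` on RateLimitError, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            # Add jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

In use, each chunk from `batched(table_names, 50)` would be wrapped in a single `with_backoff(...)` call, so one rate-limit response delays only that chunk rather than failing the whole sync.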
Accomplishments that we're proud of
First Databricks-GCP governance integration. Faster documentation. True AI agents, not a chatbot. Production-ready code. 80% cost reduction. Open source (MIT). Solves real enterprise pain at scale. Complete end-to-end solution.
What we learned
Fivetran SDK patterns, BigQuery optimization, Gemini prompt engineering, AI agent architecture, enterprise integration. Learned: governance pain is universal, speed/cost beats perfection, specialized agents > general AI, integration gaps = opportunity. $15B market underserved.
What's next
Data lineage, automated policies, multi-platform (Snowflake, BigQuery), dbt integration, RBAC, anomaly detection. Vision: Autonomous Data Steward AI that makes governance decisions automatically. Mission: Make enterprise governance effortless through AI agents.
Built With
- bigquery
- databricks-unity-catalog
- docker
- fivetran-sdk
- gemini-2.5-flash
- google-cloud
- plotly
- rest-apis
- streamlit
- vertex-ai
- python