As a data engineering intern, I spent time manually checking pipeline logs, hunting down failures, and writing incident reports. It was tedious and time-consuming. I wanted to build an agent that does all of that automatically, one that doesn't just answer questions but actually takes action. Datapilot is the tool I wish I had on the job.

Datapilot monitors your ETL pipelines in real time, detects failures and anomalies, and lets you interact with your data using natural language. You can ask it "what failed last night?" or "generate an incident report for the Sales Ingest pipeline", and it queries MongoDB, reasons over the data with Gemini, and responds with answers grounded in the actual pipeline records. The dashboard shows live pipeline health, and the agent chat goes beyond conversation: it takes actions such as generating incident reports.
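To make the "what failed last night?" flow concrete, here is a minimal sketch of the kind of query the agent might end up running. The document schema (`pipeline`, `status`, `finished_at`), the 12-hour window, and the pipeline names other than Sales Ingest are assumptions for illustration, not Datapilot's actual data model. The `matches` helper is a tiny in-memory stand-in for MongoDB so the example runs without a database.

```python
from datetime import datetime, timedelta, timezone


def failed_runs_filter(now, hours=12):
    """Build a MongoDB-style filter for runs that failed in the last `hours`.

    Field names ("status", "finished_at") are hypothetical.
    """
    since = now - timedelta(hours=hours)
    return {"status": "failed", "finished_at": {"$gte": since}}


def matches(run, query):
    """In-memory stand-in for a MongoDB find(): supports equality and $gte only."""
    for field, cond in query.items():
        value = run.get(field)
        if isinstance(cond, dict):
            if "$gte" in cond and (value is None or value < cond["$gte"]):
                return False
        elif value != cond:
            return False
    return True


def failed_pipelines(runs, now, hours=12):
    """Return the sorted names of pipelines with at least one recent failure."""
    query = failed_runs_filter(now, hours)
    return sorted({run["pipeline"] for run in runs if matches(run, query)})


# Hypothetical sample data.
utc = timezone.utc
now = datetime(2025, 1, 2, 8, 0, tzinfo=utc)
runs = [
    {"pipeline": "Sales Ingest", "status": "failed",
     "finished_at": datetime(2025, 1, 2, 1, 30, tzinfo=utc)},
    {"pipeline": "Orders Sync", "status": "success",
     "finished_at": datetime(2025, 1, 2, 2, 30, tzinfo=utc)},
    {"pipeline": "Inventory Load", "status": "failed",
     "finished_at": datetime(2025, 1, 1, 10, 0, tzinfo=utc)},  # outside the window
]
print(failed_pipelines(runs, now))
```

Against a real collection, the same filter dictionary could be passed to a `find` call; the in-memory matcher only exists to keep the sketch self-contained.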

I built Datapilot using Google ADK with Gemini 2.5 Flash as the agent brain, connected to MongoDB Atlas via the MongoDB MCP server. The backend is Python FastAPI deployed on Railway, the frontend is Next.js deployed on Vercel, and all pipeline data lives in MongoDB Atlas. The agent uses the MCP server to query collections directly and reason over the results.
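As a rough sketch of the agent-to-database wiring described above, the snippet below builds the launch settings for running the MongoDB MCP server as a stdio subprocess. It assumes the server is started with `npx -y mongodb-mcp-server` and reads its connection string from the `MDB_MCP_CONNECTION_STRING` environment variable; check both against the version you install. The `MONGODB_URI` variable name and the helper function are hypothetical, and the step of handing this configuration to ADK's MCP toolset is left out because that API varies between ADK releases.

```python
import os


def mongodb_mcp_stdio_config(connection_string):
    """Describe how to launch the MongoDB MCP server as a stdio subprocess.

    Assumption: the server is published as the npm package "mongodb-mcp-server"
    and picks up its connection string from MDB_MCP_CONNECTION_STRING.
    """
    return {
        "command": "npx",
        "args": ["-y", "mongodb-mcp-server"],
        "env": {"MDB_MCP_CONNECTION_STRING": connection_string},
    }


# MONGODB_URI is a hypothetical variable name; the fallback points at a local instance.
config = mongodb_mcp_stdio_config(
    os.environ.get("MONGODB_URI", "mongodb://localhost:27017")
)
print(config["command"], " ".join(config["args"]))
```

Keeping the connection string in the environment rather than in the argument list avoids leaking Atlas credentials into process listings and deployment logs.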

Getting the MongoDB MCP server working with Google ADK took significant debugging. Google Cloud organization policies blocked API key authentication, forcing us to switch to Application Default Credentials and eventually to a personal account in order to use the free tier. Deploying a monorepo that contains both the backend and agent folders to Railway also required careful Procfile configuration.

We built a fully working AI agent that queries a real database, reasons over the results, and responds intelligently. It moves beyond chat and actually takes action, and both the dashboard and the agent chat are fully functional on the live, publicly accessible deployment.

We learned how to integrate Google ADK with MCP servers, how to deploy Python monorepos on Railway, and how authentication works across Google Cloud services. We also learned that Gemini 2.5 Flash on the free tier works well for agent use cases.

Next, we plan to add rate limiting and authentication for the live demo, support for more pipeline types, Slack and email alerting when pipelines fail, and support for data sources beyond MongoDB.

