Inspiration

Agentic Data Governance aims to bridge the gap between undocumented data and a data science agent.

What it does

The Data Governance agent guides the user through a series of questions to create a Data Dictionary from a sample data and create a BigQuery table that later can be used by a data science agent to create more relevant visualizations and predictive models. Unfortunately I run into some issues at the end when attempting to integrate it with OpenWebUI so I am currently working through a bug that does not let the interface communicate correctly with the agent.

How we built it

The project revolves around a multi-agent system, featuring a Data Science Agent (copied from the sample data science agent) and a Data Governance Agent built with Google's Agent Development Kit (ADK), accessible to users via OpenWebUI.

The entire cloud infrastructure on Google Cloud Platform (GCP) is provisioned as code by Terraform, which sets up a secure and scalable Google Kubernetes Engine (GKE) cluster.

Application deployment onto the GKE cluster is then managed by Helm. Helm charts define, version, and streamline the installation and upgrade process within Kubernetes.

Challenges we ran into

Because Google ADK web does not have a built in method to authenticate I had concerns about exposing the web application to the public internet. For this reason I chose to use OpenWebUI as the front end.

There is no native integration between Google ADK and OpenWebUI which meant I had to look for a way to bridge the two. The method that made sense to me was to use Ollama to translate the agent output seamlessly. But this also had no integration. Finally I decided to sit LiteLLM between the agent and Ollama. This meant I had to use 2 separate middleware.

Accomplishments that we're proud of

Built an agent that successfully guides a user to build a data dictionary that is then saved as a BigQuery table that the agent can use.

Personally I am also proud to have launched the infrastructure as code using Terraform and Helm Charts for a functional CI/CD workflow.

What we learned

As my first time creating an agent, I was very impressed by how flexible Google ADK felt to develop. By typing the sample data science agent code line by line I was able to understand how the agents can be put together and tools created.

What's next for Agentic Data Science

As a data scientist myself I am looking forward to expand the capacity of this app and agent by investing more time into evaluating it, testing more of the limits tool calling, and continuing to version more features with Helm Charts.

Built With

Share this project:

Updates