About the Project
I created AI Data Quality Guardian while preparing for the Tableau Developer Challenge. I was learning Tableau from scratch, exploring how dashboards work under the hood, and I quickly discovered something surprising:
Tableau gives you amazing visualizations, but it does not warn you when the underlying data is broken.
If a metric suddenly drops to zero, becomes negative, or stops refreshing, Tableau will display it silently. You only notice the issue if you are lucky enough to see it manually.
This gave me a clear idea:
What if I built a system that checks Tableau dashboards automatically, like an AI-powered quality inspector?
That thought became the foundation of this project.
Inspiration
My inspiration came from curiosity. I wanted to understand how Tableau works behind the scenes, especially its APIs, authentication flows, metadata, and data export capabilities.
While experimenting, I realized that:
• Tableau exposes a lot of internal information through the REST API and GraphQL Metadata API,
• but almost nobody uses it for automated data quality monitoring.
So instead of manually inspecting dashboards, I wanted a tool that logs in, fetches everything, analyzes metrics, detects issues, and reports them automatically.
How I Built It
I structured the project as a fully automated pipeline around the Tableau Cloud APIs:
- REST API
• sign-in with personal access tokens
• fetch workbooks and views
• export summary data from dashboards
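To make this concrete, here is a minimal sketch of the sign-in and export sequence using the requests library. The pod URL, API version, and helper names are illustrative placeholders, not the project's actual code:

```python
import requests

# Illustrative values: the real pod URL and API version come from configuration.
SERVER = "https://10ax.online.tableau.com"
API = f"{SERVER}/api/3.22"

def sign_in(token_name: str, token_secret: str, site: str) -> tuple[str, str]:
    """Sign in with a personal access token; returns (auth token, site id)."""
    body = {"credentials": {
        "personalAccessTokenName": token_name,
        "personalAccessTokenSecret": token_secret,
        "site": {"contentUrl": site},
    }}
    # Asking for JSON avoids Tableau's XML default response.
    r = requests.post(f"{API}/auth/signin", json=body,
                      headers={"Accept": "application/json"})
    r.raise_for_status()
    creds = r.json()["credentials"]
    return creds["token"], creds["site"]["id"]

def list_views(token: str, site_id: str) -> list[dict]:
    """Fetch all views on the site."""
    r = requests.get(f"{API}/sites/{site_id}/views",
                     headers={"X-Tableau-Auth": token, "Accept": "application/json"})
    r.raise_for_status()
    return r.json()["views"]["view"]

def export_view_csv(token: str, site_id: str, view_id: str) -> str:
    """Export one view's summary data as CSV text."""
    r = requests.get(f"{API}/sites/{site_id}/views/{view_id}/data",
                     headers={"X-Tableau-Auth": token})
    r.raise_for_status()
    return r.text
```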
- Metadata API (GraphQL)
• fetch lineage, fields, and datasource info
• enrich dashboards with semantic context
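A hedged sketch of that kind of lineage query; the fields follow the public Metadata API schema, while the helper itself is illustrative:

```python
import requests

LINEAGE_QUERY = """
{
  workbooks {
    name
    sheets { name }
    upstreamDatasources {
      name
      fields { name }
    }
  }
}
"""

def fetch_lineage(server: str, token: str) -> dict:
    """Run a GraphQL query against the Metadata API, reusing the REST auth token."""
    r = requests.post(f"{server}/api/metadata/graphql",
                      json={"query": LINEAGE_QUERY},
                      headers={"X-Tableau-Auth": token})
    r.raise_for_status()
    return r.json()["data"]
```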
- Data Fetcher
A flexible parser that converts Tableau’s CSV summaries into machine-readable metrics. This required careful handling because every dashboard exports a completely different structure.
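As a rough sketch, assuming plain CSV text with a header row, the parser can simply keep whichever columns coerce to numbers:

```python
import csv
import io

def parse_summary_csv(csv_text: str) -> dict[str, list[float]]:
    """Turn a Tableau summary CSV into numeric series keyed by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns: dict[str, list[float]] = {}
    for row in reader:
        for name, raw in row.items():
            try:
                # Strip thousands separators before converting.
                value = float((raw or "").replace(",", ""))
            except ValueError:
                continue  # non-numeric cell: skip it, the column may still be usable
            columns.setdefault(name, []).append(value)
    return columns
```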
- Data Quality Engine
I implemented several rule-based validators:
• null / zero checks
• negative values
• flatline detection
• unexpected extremes
• sudden jumps and drops
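A simplified sketch of what these validators can look like; the 3-sigma extreme rule and the 50% jump threshold are illustrative choices, not the project's tuned values:

```python
def check_metric(name: str, series: list[float]) -> list[str]:
    """Apply rule-based validators to one metric; returns human-readable issues."""
    issues = []
    if not series:
        return [f"{name}: no data at all"]
    if all(v == 0 for v in series):
        issues.append(f"{name}: all values are zero")
    if any(v < 0 for v in series):
        issues.append(f"{name}: negative values present")
    if len(series) > 3 and len(set(series)) == 1:
        issues.append(f"{name}: flatline, the value never changes")
    mean = sum(series) / len(series)
    std = (sum((v - mean) ** 2 for v in series) / len(series)) ** 0.5
    for v in series:
        if std > 0 and abs(v - mean) > 3 * std:  # assumed 3-sigma rule
            issues.append(f"{name}: unexpected extreme {v}")
    for prev, cur in zip(series, series[1:]):
        if prev != 0 and abs(cur - prev) / abs(prev) > 0.5:  # assumed 50% threshold
            issues.append(f"{name}: sudden jump/drop from {prev} to {cur}")
    return issues
```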
- AI Insights
If an OpenAI key is available, the system uses it to:
• explain issues,
• suggest likely causes,
• propose next steps.
Otherwise, it works in fallback mode.
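A minimal sketch of that branch, assuming the official openai Python SDK; the model name is illustrative:

```python
import os

def explain_issues(issues: list[str]) -> str:
    """Ask OpenAI to explain detected issues; fall back to a plain summary."""
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        # Fallback mode: no LLM, just report the raw findings.
        return "Detected issues:\n" + "\n".join(f"- {i}" for i in issues)
    from openai import OpenAI  # imported lazily so fallback mode needs no SDK
    client = OpenAI(api_key=api_key)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "You are a data quality analyst. Explain each issue, "
                        "suggest a likely cause, and propose a next step."},
            {"role": "user", "content": "\n".join(issues)},
        ],
    )
    return response.choices[0].message.content
```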
- Automated Test Generator
The system transforms dashboard metrics into pytest regression tests, so future changes can be validated automatically.
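A condensed sketch of the idea; the latest_metrics fixture and the 20% tolerance are hypothetical, and sanitization keeps the generated test names valid:

```python
import re

def sanitize(name: str) -> str:
    """Turn an arbitrary metric name into a valid Python identifier."""
    cleaned = re.sub(r"\W+", "_", name).strip("_").lower() or "metric"
    return f"m_{cleaned}" if cleaned[0].isdigit() else cleaned

def generate_test(metric: str, baseline: float, tolerance: float = 0.2) -> str:
    """Emit a pytest regression test pinning a metric near its current baseline."""
    return (
        f"def test_{sanitize(metric)}_within_tolerance(latest_metrics):\n"
        f"    value = latest_metrics[{metric!r}]\n"
        f"    assert abs(value - {baseline!r}) <= {tolerance} * abs({baseline!r})\n"
    )
```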
- Multi-channel Alerts
All results can be sent to:
• Slack (Block Kit)
• Email (SMTP)
• JIRA Cloud (automatic ticket creation)
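For Slack, a minimal Block Kit sketch assuming an incoming-webhook URL:

```python
import requests

def send_slack_alert(webhook_url: str, dashboard: str, issues: list[str]) -> None:
    """Post findings to Slack as Block Kit blocks via an incoming webhook."""
    blocks = [
        {"type": "header",
         "text": {"type": "plain_text", "text": f"Data quality alert: {dashboard}"}},
        {"type": "section",
         "text": {"type": "mrkdwn",
                  "text": "\n".join(f"• {i}" for i in issues) or "No issues found."}},
    ]
    requests.post(webhook_url, json={"blocks": blocks}).raise_for_status()
```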
What I Learned
This project taught me an enormous amount about:
• Tableau REST API and authentication flows
• Using GraphQL to explore metadata and lineage
• Handling inconsistent CSV formats from different dashboards
• Robust API client design
• Building a modular architecture for a real system
• Detecting anomalies with simple yet effective statistical techniques
• Integrating notifications across Slack, JIRA, and email
• CI/CD automation using GitHub Actions
It was a deep dive into APIs, data parsing, AI-enhanced logic, and automation workflows.
Challenges I Faced
- CSV format differences
Each dashboard exports unique columns, sometimes mixing strings, numbers, percentages, or empty cells. I had to design a parser that could survive anything.
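A sketch of the cell-level coercion this takes; the handled formats mirror the ones listed above, and the helper name is my own:

```python
def coerce_number(raw: str | None) -> float | None:
    """Best-effort conversion of one messy CSV cell into a float.

    Handles empty cells, thousands separators, percentages, and
    currency prefixes; returns None when nothing numeric remains.
    """
    if raw is None:
        return None
    text = raw.strip().replace(",", "")
    if not text:
        return None
    is_percent = text.endswith("%")
    text = text.rstrip("%").lstrip("$€£")
    try:
        value = float(text)
    except ValueError:
        return None
    return value / 100 if is_percent else value
```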
- Metadata API returning incomplete data
Some views have no lineage. I added fallbacks so the system never breaks.
- Authentication errors
Tableau Cloud sometimes returns XML and sometimes JSON. Debugging and normalizing both formats were essential.
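A sketch of that normalization, assuming Tableau's documented JSON error object and its XML equivalent:

```python
import xml.etree.ElementTree as ET
import requests

def extract_error(response: requests.Response) -> str:
    """Normalize a Tableau error payload, whether it arrives as JSON or XML."""
    content_type = response.headers.get("Content-Type", "")
    if "json" in content_type:
        error = response.json().get("error", {})
        return f"{error.get('code', '?')}: {error.get('detail', response.text)}"
    try:
        root = ET.fromstring(response.text)
        detail = root.find(".//{*}detail")  # Tableau's XML errors are namespaced
        return detail.text if detail is not None else response.text
    except ET.ParseError:
        return response.text
```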
- Test Generation
Converting dashboard values into valid Python test code required clever formatting and sanitization.
- CI/CD
Getting the solution to run inside GitHub Actions required building a minimal environment with secrets and API access.
Final Result
The system now:
• logs in to Tableau Cloud,
• extracts dashboards, metrics, and metadata,
• evaluates data quality,
• detects anomalies,
• generates AI insights,
• builds regression tests,
• and sends professional alerts across multiple channels.
All automatically: a complete monitoring tool built from scratch, for real dashboards, by a single developer.