Project Story

🌪️ The Problem: The "Success Disaster"

Tableau is a victim of its own success. In every enterprise, adoption starts out exciting: everyone builds dashboards. But fast-forward two years, and you have a "Data Swamp":

  • Thousands of stale workbooks ("Zombies") clogging the server.
  • Storage costs spiraling out of control.
  • Users don't know which data to trust.
  • Admins are too terrified to delete anything for fear of breaking a critical report.

We call this Data Entropy. Human admins can’t keep up with the sprawl. They need help. They don't need a script; they need a teammate.

🤖 The Solution: Autonomous Data Steward (ADS)

ADS is not a cleanup script. It is a safety-first cognitive agent. It acts as a virtual employee that works 24/7 to reverse data entropy. It doesn't just "delete old files"—it thinks about them first. It evaluates value, risk, and cost before taking a single action.

🧠 How It Works: The SADA Loop

Unlike brittle automation scripts, ADS operates on a cognitive cycle we call SADA:

  1. SENSE (👀): The agent scans the Tableau ecosystem via the Metadata API, building a real-time map of usage, lineage, and "staleness."
  2. ANALYZE (🧠): It identifies "Zombie" content (e.g., high storage, zero views in 180 days). It calculates the ROI of removal (e.g., "Archiving this saves $47/year").
  3. DECIDE (⚖️): This is where we innovate. The agent calculates a "Blast Radius Score":
     • Is this used by the CEO? (VIP Iron Dome protection)
     • Is it embedded in Salesforce?
     If the risk is too high, it pauses. If the risk is low, it proceeds.
  4. ACT (⚡): It executes the decision, archiving the workbook to cold storage and notifying the owner via Slack with a restoration link.
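
To make the ANALYZE and DECIDE steps concrete, here is a minimal Python sketch of how they could fit together. It is not the ADS source code: the `Workbook` fields, the scoring weights, the size cutoff, the storage price, and the risk ceiling are all hypothetical values chosen for illustration, and the inventory is hard-coded instead of being pulled from the Metadata API.

```python
from dataclasses import dataclass

# Every threshold, weight, and price below is an illustrative assumption.
MIN_ZOMBIE_SIZE_MB = 100.0   # assumed cutoff for "high storage"
USD_PER_GB_YEAR = 10.0       # assumed storage price
MAX_BLAST_RADIUS = 3         # assumed risk ceiling


@dataclass
class Workbook:
    name: str
    size_mb: float
    views_last_180_days: int
    downstream_dependents: int = 0     # content that depends on this workbook
    vip_stakeholder: bool = False      # e.g. the CEO owns or subscribes to it
    embedded_externally: bool = False  # e.g. embedded in Salesforce


def is_zombie(wb: Workbook) -> bool:
    """ANALYZE: large content with zero views in the last 180 days."""
    return wb.views_last_180_days == 0 and wb.size_mb >= MIN_ZOMBIE_SIZE_MB


def annual_savings_usd(wb: Workbook) -> float:
    """ANALYZE: rough storage cost recovered per year by archiving."""
    return round(wb.size_mb / 1024 * USD_PER_GB_YEAR, 2)


def blast_radius_score(wb: Workbook) -> int:
    """DECIDE: higher score means more could break if this disappears."""
    score = wb.downstream_dependents
    if wb.vip_stakeholder:
        score += 10
    if wb.embedded_externally:
        score += 5
    return score


def decide(wb: Workbook) -> str:
    """Return 'keep', 'pause' (too risky), or 'archive'."""
    if not is_zombie(wb):
        return "keep"
    if blast_radius_score(wb) > MAX_BLAST_RADIUS:
        return "pause"
    return "archive"


inventory = [
    Workbook("Q1 Sales Draft", size_mb=2048, views_last_180_days=0),
    Workbook("Board KPIs", size_mb=512, views_last_180_days=0, vip_stakeholder=True),
    Workbook("Daily Ops", size_mb=300, views_last_180_days=420),
]
decisions = {wb.name: decide(wb) for wb in inventory}
```

In this sketch, only the unused draft is archived; the stale board workbook is paused because a VIP is attached to it, and the actively viewed workbook is never considered.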

🛡️ Built for the Enterprise: Safety First

The biggest fear with AI is that it will go rogue. We built ADS with an "Iron Dome" safety architecture:

  • Blast Radius Limit: The agent will refuse to touch any asset with a high dependency score.
  • VIP Protection: Tagging a project with a protected label such as Project_Titan makes it immune to the AI's decisions.
  • Dry-Run Default: The agent runs in simulation mode by default, generating "impact reports" without touching a single file.
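
The three safeguards above can be sketched as a single guard in front of the archive step. This is a hedged illustration, not the actual ADS implementation: `Asset`, `plan_action`, `run`, the `PROTECTED_TAGS` set, and the dependency ceiling of 3 are hypothetical names and values, and `archive_fn` stands in for whatever call really moves a workbook to cold storage.

```python
from dataclasses import dataclass

# Hypothetical configuration; the real tag names and limits may differ.
PROTECTED_TAGS = frozenset({"Project_Titan"})
MAX_DEPENDENCY_SCORE = 3


@dataclass
class Asset:
    name: str
    dependency_score: int
    tags: frozenset = frozenset()


def plan_action(asset: Asset) -> str:
    """Apply the safety gates in order and return the planned action."""
    if asset.tags & PROTECTED_TAGS:
        return "skip: VIP-protected"
    if asset.dependency_score > MAX_DEPENDENCY_SCORE:
        return "skip: blast radius too high"
    return "archive"


def run(assets, archive_fn, dry_run=True):
    """Build an impact report; only call archive_fn when dry_run is False."""
    report = []
    for asset in assets:
        action = plan_action(asset)
        if action == "archive" and not dry_run:
            archive_fn(asset)
        report.append({"asset": asset.name, "action": action, "dry_run": dry_run})
    return report


assets = [
    Asset("Old Export", dependency_score=0),
    Asset("Titan Forecast", dependency_score=0, tags=frozenset({"Project_Titan"})),
    Asset("Shared Source", dependency_score=8),
]

archived = []
impact_report = run(assets, archive_fn=archived.append)  # dry run by default
```

Because `dry_run` defaults to `True`, the call above only produces the report and leaves `archived` empty; a caller has to pass `dry_run=False` explicitly before anything is touched.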

💻 Tech Stack

  • Core: Python
  • Integration: Tableau Server Client (TSC), Tableau Metadata API
  • Cognition: OpenAI API (for narrative reasoning and decision explanation)
  • Interface: Rich (CLI), Streamlit (Dashboard), Slack SDK (Notifications)

🚀 The Impact

In our testing simulations, ADS didn't just clean files; it generated value.

  • Detected: 15 "Zombie" workbooks in under 30 seconds.
  • Projected Savings: Calculated over $2,400 in annualized storage and compute waste.
  • Time Saved: Freed up approx. 15 hours/week of manual admin auditing.

ADS turns Governance from a chore into an autonomous advantage. It’s the data steward that never sleeps.

Built With

  • cerberus
  • fastapi
  • graphql
  • jira-api
  • openai-api
  • pytest
  • python
  • rich
  • slack-api
  • streamlit
  • tableau
  • tableau-server-client