WordPress Health Guardian

What it does

WordPress Health Guardian is an AI-powered agent that monitors and analyzes WordPress site health. Given any WordPress URL, it runs comprehensive checks — uptime, SSL certificates, DNS resolution, and WordPress-specific endpoints — while simultaneously querying Dynatrace for active problems and monitored entities. A Gemini 2.5 Flash agent (via Vertex AI) analyzes all the collected data and generates a structured health report with a score, issue summary, and actionable recommendations. Results are persisted to Firestore for trend analysis, and the entire application is instrumented with OpenTelemetry shipping traces to Dynatrace.

How we built it

The application is built on Google's Agent Development Kit (ADK) with a FastAPI web server deployed on Cloud Run. The agent uses a custom VertexAIGemini model class that forces the Vertex AI backend on GCP, running gemini-2.5-flash in us-central1. The agent has ten function tools covering health checks, Dynatrace queries, report generation, and GCP config detection.
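
To make the health-check tools concrete, here is a minimal sketch of what an SSL certificate check could look like as a plain Python function returning a dict. The function names check_ssl and days_until_expiry and the shape of the returned dict are assumptions for illustration, not the project's actual tool code.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired)."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def check_ssl(hostname: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Open a TLS connection and summarize certificate validity.

    Hypothetical tool: the result keys are illustrative only.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname) as tls:
                cert = tls.getpeercert()
    except OSError as exc:  # covers DNS failures, timeouts, and ssl.SSLError
        return {"check": "ssl", "ok": False, "error": str(exc)}
    days_left = days_until_expiry(cert["notAfter"])
    return {"check": "ssl", "ok": days_left > 0, "days_left": round(days_left, 1)}
```

Returning a structured dict with an explicit error field, rather than raising, lets a caller include the failed check in the report instead of aborting the whole run.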

For the Dynatrace integration, we implemented a dual approach:

The official @dynatrace-oss/dynatrace-mcp-server connected via ADK's McpToolset for deep observability queries
Direct Dynatrace API v2 calls using a classic token with problems.read and entities.read scopes for reliable data access
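
The direct API path can be sketched with the standard library alone. This example assumes the classic token is sent in an Authorization header with the Api-Token scheme and that open problems are selected with problemSelector; the function names, the page size, and the placeholder environment URL are illustrative assumptions.

```python
import json
import urllib.parse
import urllib.request


def build_problems_request(base_url: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for open problems."""
    query = urllib.parse.urlencode(
        {"problemSelector": 'status("open")', "pageSize": 50}
    )
    url = f"{base_url.rstrip('/')}/api/v2/problems?{query}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Api-Token {api_token}",  # needs problems.read
            "Accept": "application/json",
        },
    )


def fetch_open_problems(base_url: str, api_token: str, timeout: float = 10.0) -> list:
    """Send the request and return the list under the 'problems' key."""
    request = build_problems_request(base_url, api_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("problems", [])
```

Splitting request construction from the network call keeps the URL and header logic easy to unit test without a live Dynatrace environment.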

For observability of the agent itself, we added OpenTelemetry instrumentation with an OTLP HTTP exporter that ships traces to Dynatrace's /api/v2/otlp/v1/traces endpoint, using a dedicated token with the openTelemetryTrace.ingest scope.

Google Cloud services used:

Vertex AI — Gemini 2.5 Flash model inference
Firestore — Health check history persistence
Secret Manager — Four secrets (Dynatrace MCP token, classic API token, OTel token, scheduler auth)
Cloud Scheduler — Weekly automated health checks (Mondays 8AM UTC)
Cloud Run — Serverless container hosting
Artifact Registry — Container image storage
Cloud Build — CI/CD pipeline
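
As a rough illustration of the scheduled check, the weekly trigger (Mondays 8AM UTC) corresponds to the cron expression below, and the receiving endpoint can compare a shared secret in constant time. The header name X-Scheduler-Token and the function name are assumptions; the write-up only says that a scheduler auth secret is stored in Secret Manager.

```python
import hmac

# Cron fields: minute hour day-of-month month day-of-week (1 = Monday).
WEEKLY_SCHEDULE = "0 8 * * 1"
SCHEDULE_TIME_ZONE = "Etc/UTC"


def is_scheduler_request(headers: dict, expected_token: str) -> bool:
    """Return True only when the caller presents the shared scheduler secret.

    Hypothetical check: 'X-Scheduler-Token' is an illustrative header name.
    """
    supplied = headers.get("X-Scheduler-Token", "")
    if not expected_token:
        return False  # never accept requests when no secret is configured
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())
```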

Challenges we ran into

Vertex AI rate limits: The free tier for gemini-2.5-flash enforces a limit of 2 requests per minute on some quotas, which caused frequent 429 errors. We worked around this by wrapping the agent call in a try/except block that falls back to direct health checks when the model is unavailable, so users always receive a report.
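
A minimal sketch of this fallback pattern follows. The callables run_agent and run_direct_checks stand in for the real agent call and the direct checks; their names, the report shape, and the "source" field are assumptions made for illustration.

```python
import logging
from typing import Callable

logger = logging.getLogger("health_guardian")


def generate_report(
    url: str,
    run_agent: Callable[[str], dict],
    run_direct_checks: Callable[[str], dict],
) -> dict:
    """Prefer the model-backed report; fall back to direct checks on failure."""
    try:
        report = run_agent(url)
    except Exception as exc:  # broad on purpose: 429s, timeouts, auth errors
        logger.warning("Agent unavailable for %s (%s); using direct checks", url, exc)
        return {**run_direct_checks(url), "source": "fallback"}
    return {**report, "source": "agent"}
```

Tagging the report with its source makes it visible in the UI or in stored history whether a given result came from the model or from the fallback path.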

ADK data serialization: The ADK framework's tool call mechanism was passing Python repr strings (single quotes, True/False booleans) instead of JSON to the report generation function. We fixed this by changing parameter types from str to dict and adding a _safe_parse() fallback that tries json.loads first, then ast.literal_eval.

Dynatrace MCP compatibility: The MCP server failed on Cloud Run's Linux environment because the initial npx.cmd command is Windows-specific. We fixed it by switching to npx (without .cmd), which works on both platforms.
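
An alternative to hard-coding either name is to pick the launcher from the current platform. This is a variation on the fix described above, not what we shipped; the helper names and the use of the -y flag to skip the install prompt are assumptions.

```python
import sys


def npx_executable(platform: str = sys.platform) -> str:
    """Return the npx launcher name for the given sys.platform value."""
    return "npx.cmd" if platform.startswith("win") else "npx"


def mcp_server_command(platform: str = sys.platform) -> list:
    """Argument list for starting the Dynatrace MCP server through npx."""
    return [npx_executable(platform), "-y", "@dynatrace-oss/dynatrace-mcp-server"]
```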

OpenTelemetry configuration: The initial OTel setup used the platform token (lacking OTLP ingest scope) and had an incorrect endpoint URL format. We created a dedicated classic token with openTelemetryTrace.ingest scope and normalized the endpoint URL to use the live.dynatrace.com domain.
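
The endpoint normalization can be sketched like this. It assumes the incorrect input was a platform-style apps.dynatrace.com URL or a bare hostname; the write-up does not say exactly what the original value looked like, and the function names are illustrative.

```python
from urllib.parse import urlsplit

OTLP_TRACES_PATH = "/api/v2/otlp/v1/traces"
PLATFORM_SUFFIX = ".apps.dynatrace.com"  # assumed form of the wrong URL
CLASSIC_SUFFIX = ".live.dynatrace.com"


def otlp_traces_endpoint(environment_url: str) -> str:
    """Normalize an environment URL to the OTLP traces ingest endpoint."""
    if "://" not in environment_url:
        environment_url = f"https://{environment_url}"
    host = urlsplit(environment_url).hostname or ""
    if not host:
        raise ValueError("environment URL has no hostname")
    if host.endswith(PLATFORM_SUFFIX):
        host = host[: -len(PLATFORM_SUFFIX)] + CLASSIC_SUFFIX
    # Any existing path is discarded so the ingest path is never duplicated.
    return f"https://{host}{OTLP_TRACES_PATH}"


def otlp_headers(ingest_token: str) -> dict:
    """Auth header for a token with the openTelemetryTrace.ingest scope."""
    return {"Authorization": f"Api-Token {ingest_token}"}
```

The two return values map onto the endpoint and headers arguments of the OpenTelemetry OTLP HTTP span exporter.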

Accomplishments we're proud of

A fully functional multi-service architecture spanning seven GCP services plus Dynatrace
Graceful fallback behavior when any component (Vertex AI, Dynatrace) is unavailable
Real Dynatrace API integration returning live problem and entity data (not just "MCP available" placeholders)
OpenTelemetry traces shipping to Dynatrace for agent-level observability
Clean, responsive web UI with markdown-rendered health reports and four dedicated tabs

What we learned

Building agents with Google ADK requires careful attention to tool function signatures: the framework serializes arguments in specific formats, and mismatches cause silent failures
Dynatrace platform tokens and classic API tokens serve different purposes and carry different scopes; full integration requires at least three distinct tokens (MCP, API v2 data, and OTel ingest)
Vertex AI free tier rate limits (2 RPM for gemini-2.5-flash) are significantly more restrictive than the equivalent Gemini API limits, making fallback logic essential for any production-like deployment
Cloud Run's Linux environment differs from local Windows development in subtle ways (binary names, path resolution, npm modules with native dependencies)

Built With

artifact-registry
cloud-build
cloud-run
cloud-scheduler
css
dynatrace-api-v2
dynatrace-mcp-server
fastapi
firestore
gemini-2.5-flash
google-adk
html5
javascript
opentelemetry
secret-manager

Updates

Tochukwu Mesigo started this project — Jun 08, 2026 01:07 PM EDT
