Inspiration

Production incidents cost companies thousands per minute. Most response time is wasted on manual, repetitive investigation — the same steps every time. CloudGuardian automates that entire process.

What it does

CloudGuardian is a 7-agent autonomous reliability engineer. When an incident is reported, it investigates via Dynatrace, triages the root cause, matches historical patterns, proposes remediation options with risk scores, waits for human approval, executes the fix, and generates a full postmortem, all in under 2 minutes.
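
The flow above can be sketched as a sequential pipeline with an approval gate. This is a minimal, framework-free illustration rather than the project's ADK code: the mapping of worker agents to stages and the names Incident, run_pipeline, and approve are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Agent names come from the write-up; the Supervisor role is played by
# run_pipeline below. Everything else in this sketch is hypothetical.
STAGES = ["Watcher", "Triage", "Learning", "Remediation", "Executor", "Reporter"]


@dataclass
class Incident:
    description: str
    log: list[str] = field(default_factory=list)
    approved: bool = False


def run_pipeline(incident: Incident, approve: Callable[[Incident], bool]) -> Incident:
    """Run each stage in order, pausing for human approval before Executor."""
    for stage in STAGES:
        if stage == "Executor":
            # Human approval gate: the fix is only executed on an explicit yes.
            incident.approved = approve(incident)
            if not incident.approved:
                incident.log.append("Executor: skipped (not approved)")
                continue
        incident.log.append(f"{stage}: done")
    return incident


# Example: a reviewer who rejects the proposed fix.
result = run_pipeline(Incident("checkout latency spike"), approve=lambda inc: False)
print(result.log)
```

In this sketch a rejected fix still reaches the Reporter stage, so a postmortem can be written either way. Whether CloudGuardian behaves the same on rejection is not stated in this write-up.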

How we built it

Built with the Google Cloud Agent Builder framework (ADK 2.1.0), the official SDK for building and deploying agents on Vertex AI Agent Platform. The cloudguardian-supervisor agent is registered on Agent Platform (resource ID: 797357036370132992), and the full 7-agent system runs via ADK's runtime on Cloud Run.

Dynatrace MCP Server (20+ tools, including list_problems, execute_dql, list_vulnerabilities, find_entity_by_name)
7 specialized agents: Supervisor, Watcher, Triage, Learning, Remediation, Executor, Reporter
OpenTelemetry + GoogleADKInstrumentor shipping traces to Dynatrace
Cloud Run for hosting, nginx proxy for the web UI
Modular Python architecture with per-agent files

CloudGuardian implements Dynatrace's recommended observability pattern for Google ADK agents: OpenTelemetry with GoogleADKInstrumentor captures full end-to-end traces across the agentic AI stack, as documented on the Dynatrace Hub Vertex AI page.
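
To show what a call to one of the Dynatrace MCP tools looks like on the wire, here is a rough sketch of the JSON-RPC 2.0 frame an MCP client writes to the server's stdin. The tools/call method and the newline framing follow the MCP stdio transport. The helper name tool_call_request, the argument key dqlStatement, and the query string are placeholders assumed for the example, not taken from the project.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs a distinct id


def tool_call_request(tool: str, arguments: dict) -> bytes:
    """Encode one MCP `tools/call` request as a newline-delimited JSON-RPC frame."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport separates messages with a newline, so the JSON
    # must stay on one line; json.dumps escapes newlines inside strings.
    return (json.dumps(message) + "\n").encode("utf-8")


# `execute_dql` is a tool name from the write-up; the argument key and
# the query below are hypothetical placeholders.
frame = tool_call_request("execute_dql", {"dqlStatement": "fetch logs | limit 5"})
print(frame)
```

A real session would first complete the MCP initialize handshake before sending tool calls. In CloudGuardian this exchange is presumably handled by ADK's MCP tooling and the custom wrapper rather than by hand-built frames.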

Closed loop integration

Dynatrace monitors infrastructure → CloudGuardian reads from Dynatrace via MCP → CloudGuardian fixes infrastructure → Dynatrace observes CloudGuardian via OTel. Both directions proven with live data.

Challenges

mcp module incompatibility with Python 3.13 in Vertex AI Agent Engine: solved by switching to Cloud Run with a custom Dockerfile
Windows CMD vs Linux binary path differences for MCP stdio transport: solved with platform detection in mcp_wrapper.py
BatchSpanProcessor flushing before Cloud Run container shutdown: solved with GoogleADKInstrumentor, which hooks natively into ADK
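
The platform detection mentioned above could look roughly like the following. This is a sketch under assumptions, not the contents of mcp_wrapper.py: it assumes the MCP server is a Node.js package launched with npx, and the package name is a hypothetical placeholder.

```python
import platform
from typing import Optional

# Hypothetical package name; substitute whatever launches the MCP server.
MCP_PACKAGE = "example-mcp-server"


def mcp_server_command(system: Optional[str] = None) -> list[str]:
    """Return the argv used to spawn the MCP server for the given OS."""
    system = system or platform.system()
    if system == "Windows":
        # On Windows, npx is a .cmd shim, so it is launched through cmd.exe.
        return ["cmd", "/c", "npx", "-y", MCP_PACKAGE]
    # On Linux (for example, inside a Cloud Run container) npx runs directly.
    return ["npx", "-y", MCP_PACKAGE]


print(mcp_server_command())
```

Taking the OS name as an optional parameter keeps the function testable on any machine, since both branches can be exercised without switching platforms.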

Accomplishments

Real Dynatrace MCP tool calls confirmed in the ADK Events tab
cloudguardian-supervisor registered on Vertex AI Agent Platform (Agent Builder deployment confirmed)
Full 7-agent chain working end to end with human approval gate
Distributed traces visible in Dynatrace showing every agent_run, call_llm, and execute_tool span
Live web UI deployed on Cloud Run

What we learned

GoogleADKInstrumentor is the correct way to instrument ADK agents for Dynatrace — it automatically captures every tool call and model completion without manual span creation.

What's next

Real Dynatrace entity resolution for production services
Slack/PagerDuty integration for approval notifications
Multi-environment support (staging vs production)
DukanPage integration for MSME incident monitoring

Built With

cloud-run
dynatrace-mcp
fastapi
gemini-2.5-flash
google-adk
google-cloud-agent-builder
googleadkinstrumentor
nginx
opentelemetry
python
vertex-ai

Submitted to

Google Cloud Rapid Agent Hackathon

Created by

I independently designed and built the full CloudGuardian system end-to-end. This included architecting the 7-agent multi-agent system using Google Cloud ADK, integrating the Dynatrace MCP server via a custom stdio wrapper to handle cross-platform JSON-RPC communication, deploying the backend on Cloud Run with Node.js for MCP stdio transport, building the web UI with a FastAPI reverse proxy to handle CORS, instrumenting the system with OpenTelemetry shipping traces to Dynatrace, registering the agent on Vertex AI Agent Platform via Agent Studio, and producing the demo video. I debugged and resolved multiple platform-specific issues including Windows stdio pipe limitations, Python 3.13 MCP module compatibility in Agent Engine, Cloud Run session management, and CORS configuration across two Cloud Run services.

Surendra Kumar
Full-stack developer building scalable mobile platforms. Shipped Gfood, ServeNow & Zesto.

Updates

Surendra Kumar started this project — Jun 08, 2026 03:25 AM EDT
