DevStreamAI — Automated CI/CD Debugging with AI

Inspiration

CI/CD failures are unpredictable, repetitive, and time-consuming. Engineers often spend hours reading logs, identifying root causes, and applying similar fixes across multiple repositories. We wanted to build an automated DevOps system that not only detects failures but also understands and fixes them. With GCP Vertex AI, Confluent Cloud, and serverless infrastructure, we saw an opportunity to automate CI debugging end-to-end.


What it does

DevStreamAI listens to CI/CD failure events coming from a Confluent Cloud Kafka topic, analyzes them through an AI engine powered by GCP Vertex AI, and generates human-readable explanations along with patch diffs.

The system then automatically:

  • Creates GitHub pull requests
  • Applies AI-generated patches
  • Notifies developers via Slack or email

A Streamlit dashboard displays CI failure events, explanations, and PR activity in real time.


How we built it

Confluent Cloud (Kafka)

Confluent Cloud acts as the central real-time streaming pipeline.
CI pipelines publish failure logs to Kafka, and a consumer picks them up and triggers the full AI and automation workflow. DevStreamAI uses three Kafka topics in Confluent Cloud:

  • ci_failures – Streams raw CI/CD failure logs.
  • ci_ai_fix – Stores AI-generated explanations and code patches.
  • ci_pr_updates – Emits updates about PR creation, merge status, and patch operations.
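
As a rough illustration of how events might move across these three topics, the sketch below models the payloads as JSON. The field names (repo, run_id, log, and so on) and the stage order are assumptions made for the example, not the project's actual schema. The Kafka client is left out so the snippet runs with the standard library alone.

```python
import json
from dataclasses import dataclass, asdict

# Topic names as listed above.
TOPIC_FAILURES = "ci_failures"
TOPIC_AI_FIX = "ci_ai_fix"
TOPIC_PR_UPDATES = "ci_pr_updates"


@dataclass
class CIFailure:
    # Hypothetical fields; the real payload may differ.
    repo: str
    run_id: str
    log: str


@dataclass
class AIFix:
    # Hypothetical fields for a message on ci_ai_fix.
    repo: str
    run_id: str
    explanation: str
    patch: str


def encode(event) -> bytes:
    """Serialize an event dataclass to the bytes a Kafka producer would send."""
    return json.dumps(asdict(event)).encode("utf-8")


def decode_failure(raw: bytes) -> CIFailure:
    """Parse a message value read from ci_failures."""
    return CIFailure(**json.loads(raw.decode("utf-8")))


def next_topic(topic: str):
    """Return the topic the next stage writes to, or None at the end.

    The order failures -> AI fix -> PR updates is assumed from the
    end-to-end flow described in this writeup.
    """
    order = [TOPIC_FAILURES, TOPIC_AI_FIX, TOPIC_PR_UPDATES]
    i = order.index(topic)
    return order[i + 1] if i + 1 < len(order) else None
```

In a real deployment, the bytes returned by encode would be handed to a Kafka producer as the message value.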

GCP Vertex AI

Handles the intelligence layer:

  • Parsing CI failure logs
  • Detecting root causes
  • Generating human-readable explanations
  • Generating code patch diffs
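
One way the four steps above could be wired together is to ask the model for a structured reply and validate it before anything reaches GitHub. The sketch below is a guess at that shape: the prompt wording, the JSON keys, and the helper names build_prompt and parse_reply are all hypothetical, and the actual Vertex AI call is omitted.

```python
import json

# Hypothetical prompt; the project's real prompt is not shown in this writeup.
PROMPT_TEMPLATE = """You are a CI debugging assistant.
Analyze the failure log below and reply with a JSON object that has
exactly these keys: "root_cause", "explanation", "patch" (a unified diff).

Repository: {repo}
Log:
{log}
"""

REQUIRED_KEYS = {"root_cause", "explanation", "patch"}


def build_prompt(repo: str, log: str, max_chars: int = 8000) -> str:
    """Build the model prompt, keeping only the tail of very long logs."""
    return PROMPT_TEMPLATE.format(repo=repo, log=log[-max_chars:])


def parse_reply(text: str) -> dict:
    """Extract and validate the JSON object in a model reply.

    Text before the first '{' and after the last '}' is ignored, so a
    reply wrapped in extra prose still parses.
    """
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in model reply")
    data = json.loads(text[start:end + 1])
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model reply is missing keys: {sorted(missing)}")
    return data
```

Rejecting malformed replies at this point keeps an incomplete answer from turning into a broken pull request later in the pipeline.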

GCP Firestore

Firestore stores configuration and metadata:

  • Repository list
  • Project-to-repository mapping
  • Dynamic system settings

Keeping this metadata in Firestore lets the system scale across multiple repositories.
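
To make the project-to-repository mapping concrete, here is a minimal sketch that uses plain dictionaries as a stand-in for Firestore documents. The collection layout, document IDs, and setting names (base_branch, auto_pr) are assumptions for illustration only; the real data model may look different.

```python
# In-memory stand-in for two hypothetical Firestore collections:
#   projects/{project_id} -> {"repos": [repo document ids]}
#   repos/{repo_id}       -> per-repository settings
# Firestore document ids cannot contain "/", hence "org__name" ids.
PROJECTS = {
    "payments": {"repos": ["org__payments-api", "org__payments-web"]},
}
REPOS = {
    "org__payments-api": {
        "full_name": "org/payments-api",
        "base_branch": "main",
        "auto_pr": True,
    },
    "org__payments-web": {
        "full_name": "org/payments-web",
        "base_branch": "develop",
        "auto_pr": False,
    },
}
DEFAULTS = {"base_branch": "main", "auto_pr": False}


def repos_for_project(project_id: str) -> list:
    """Resolve a project to the settings of each repository it maps to."""
    repo_ids = PROJECTS.get(project_id, {}).get("repos", [])
    # Per-repository values override the defaults; unknown ids are skipped.
    return [{**DEFAULTS, **REPOS[r]} for r in repo_ids if r in REPOS]


def auto_pr_targets(project_id: str) -> list:
    """Names of repositories in the project that allow automated PRs."""
    return [r["full_name"] for r in repos_for_project(project_id) if r["auto_pr"]]
```

With a layout along these lines, onboarding a repository would be a data change rather than a code change.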

FastAPI Backend on Cloud Run

A containerized FastAPI backend deployed on Cloud Run.

It exposes secure endpoints for:

  • AI processing
  • GitHub automation
  • Dashboard data retrieval

Cloud Run API URL:
https://devstream-backend-176657413002.us-central1.run.app

GitHub Automation Layer

Uses GitHub APIs to:

  • Create branches
  • Apply AI-generated patches
  • Open automated pull requests

This layer closes the loop: a detected CI failure ends in a reviewable pull request.
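
The branch and pull request steps map onto two GitHub REST API endpoints. The sketch below only builds the requests, using the standard library, so it can be inspected without network access. The helper names and the branch naming are hypothetical, and the step that commits the patched files to the new branch is left out.

```python
import json
import urllib.request

API = "https://api.github.com"


def _request(method: str, path: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated GitHub REST API request."""
    return urllib.request.Request(
        url=f"{API}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def create_branch(repo: str, branch: str, base_sha: str, token: str) -> urllib.request.Request:
    """POST /repos/{owner}/{repo}/git/refs creates a branch at base_sha."""
    return _request(
        "POST",
        f"/repos/{repo}/git/refs",
        token,
        {"ref": f"refs/heads/{branch}", "sha": base_sha},
    )


def open_pull_request(repo: str, branch: str, base: str, title: str,
                      body: str, token: str) -> urllib.request.Request:
    """POST /repos/{owner}/{repo}/pulls opens a PR from branch into base."""
    return _request(
        "POST",
        f"/repos/{repo}/pulls",
        token,
        {"title": title, "head": branch, "base": base, "body": body},
    )
```

Passing either request to urllib.request.urlopen would send it; the token would normally come from a secret store rather than from source code.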

Streamlit Dashboard

A real-time dashboard that displays:

  • Failure logs
  • AI explanations
  • Patch diffs
  • PR creation activity

Everything updates live as Kafka events arrive.

GCP Compute Engine VM

A Compute Engine VM hosts:

  • Kafka consumer
  • Streamlit dashboard
  • NGINX reverse proxy

It runs continuously to handle workloads that are poorly suited to Cloud Run, such as:

  • Long-running Kafka consumers
  • Real-time dashboard updates
  • Background processing
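
A long-running consumer of this kind is essentially a loop that polls, processes, and retries. The sketch below shows that shape with the Kafka client abstracted behind a poll callable, so it runs without a broker. The function name, the retry policy, and the dead-letter list are assumptions for illustration, not the project's implementation.

```python
import time


def run_consumer(poll, handle, max_attempts=3, base_delay=1.0,
                 sleep=time.sleep, should_stop=lambda: False):
    """Poll for messages until should_stop() is true, retrying failed handlers.

    poll()      -> next message, or None when nothing arrived; it is
                   expected to block briefly, as a Kafka poll with a
                   timeout does, so the loop does not spin
    handle(msg) -> processes one message; may raise on transient errors
    Returns the messages that still failed after max_attempts.
    """
    dead_letters = []
    while not should_stop():
        msg = poll()
        if msg is None:
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                handle(msg)
                break
            except Exception:
                if attempt == max_attempts:
                    # Give up on this message and keep the loop alive.
                    dead_letters.append(msg)
                else:
                    # Exponential backoff: base_delay, then 2x, 4x, ...
                    sleep(base_delay * 2 ** (attempt - 1))
    return dead_letters
```

Processing one message at a time keeps handling in arrival order, at the cost of a slow or failing message delaying the ones behind it.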

Challenges we ran into

  • Managing sensitive credentials safely using GCP Secret Manager and environment variables.
  • Coordinating a multi-cloud architecture (GCP + Confluent + GitHub) with consistent retries and ordering.
  • Making AI-generated patches accurate and maintainable across diverse repository structures.

Accomplishments that we're proud of

  • Built a fully automated AI-driven DevOps pipeline.
  • Achieved end-to-end flow: CI failure → Confluent Kafka → AI analysis → patch → GitHub PR.
  • Designed a scalable multi-repository architecture powered by Firestore.
  • Built a real-time monitoring dashboard.
  • Significantly reduced debugging time for repetitive CI failures.

What we learned

  • Designing serverless AI architectures using Cloud Run and Vertex AI.
  • Building reliable real-time systems using Confluent Cloud.
  • Secure multi-cloud secret management.
  • Event-driven architecture with retries and fault tolerance.
  • Structuring repository metadata for scalable DevOps automation.

What's next for DevStreamAI

  • Add support for Jenkins, GitLab CI, and CircleCI.
  • Improve AI patch quality using repository embeddings and context memory.
  • Migrate from Firestore to BigQuery for analytics.
  • Add role-based access to the dashboard.
  • Implement automatic retry and self-healing workflows.
