🚀 RepoPilot — Vectorized CI/CD Intelligence System
Transforming CI/CD from reactive debugging into a continuously learning system using AI agents + semantic memory.
📓 Abstract
CI/CD pipelines repeatedly fail for familiar reasons, yet most systems treat each failure as new.
RepoPilot introduces a semantic memory layer that allows pipelines to learn from past fixes.
By combining AI agents with vector-based retrieval, RepoPilot matches each new failure against past fixes, so CI debugging gets faster and needs less manual investigation as the history grows.
📍 The Problem — CI Has No Memory
In most development teams, CI failures follow a predictable pattern:
- A build fails
- A developer investigates
- The issue is fixed
- The solution disappears into commit history
Weeks later, a similar failure appears again.
- The logs exist
- The fix exists
- But the connection between them is lost
CI systems detect failures efficiently — but they do not understand them.
That gap is where RepoPilot begins.
🎯 Core Idea — Create a Semantic Memory Layer
Instead of treating CI logs as disposable text, RepoPilot stores:
- Raw CI failure logs
- AI-generated root cause analysis
- Code changes that fixed the issue
- Repository metadata
- Pull request links
- Timestamp
When a new build fails, RepoPilot searches past failures by meaning — not just keywords.
🤖 What is RepoPilot?
RepoPilot is an AI-powered CI/CD fix agent that:
- 🎧 Listens to GitHub webhook events
- 🔍 Monitors CI workflow failures
- 🤖 Analyzes logs using AI agents
- 🛠 Generates code fixes
- 🔁 Opens pull requests automatically
- 🧠 Stores failure + fix history in Elasticsearch
This transforms CI from reactive debugging into a continuously learning system.
🧠 Core Innovation — Semantic Failure Memory
Each CI failure is stored as a structured document containing:
- error_text (semantic_text field)
- AI-generated root cause analysis
- Code changes that fixed the issue
- Repository metadata
- Pull request URL
- Timestamp
Instead of keyword matching, RepoPilot uses semantic similarity search.
This means it can retrieve failures that mean the same thing — even if the logs differ syntactically.
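As a rough illustration of the structured document listed above, here is a minimal Python sketch. Only `error_text` is named in this write-up; every other field name (`root_cause`, `fix_diff`, `repository`, `pull_request_url`, `timestamp`) and the helper `new_record` are assumptions made for the example, not the project's actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class FailureRecord:
    """One stored CI failure plus the fix that resolved it (illustrative schema)."""

    error_text: str        # raw CI failure log; the semantic_text field
    root_cause: str        # AI-generated root cause analysis (name assumed)
    fix_diff: str          # code changes that fixed the issue (name assumed)
    repository: str        # repository metadata, reduced here to "owner/name"
    pull_request_url: str  # link to the pull request that carried the fix
    timestamp: str         # ISO 8601 time at which the record was created

    def to_document(self) -> dict:
        """Return a plain dict that could be sent to an index API."""
        return asdict(self)


def new_record(error_text: str, root_cause: str, fix_diff: str,
               repository: str, pull_request_url: str) -> FailureRecord:
    """Hypothetical helper: build a record stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return FailureRecord(error_text, root_cause, fix_diff,
                         repository, pull_request_url, now)
```

Keeping the record as a flat dict means the same object can be indexed, logged, or passed to an agent without conversion.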
🏗 Architecture Overview
🔹 Step 1 — GitHub Webhook Trigger
When a workflow fails:
- GitHub sends a webhook event
- RepoPilot receives it via FastAPI
- Logs and metadata are collected
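The webhook step above could look roughly like the following sketch. It uses only the standard library so it can sit behind any FastAPI route; the function names and the returned field selection are assumptions for illustration. It relies on two documented GitHub behaviours: deliveries carry an `X-Hub-Signature-256` header of the form `sha256=<hex HMAC of the body>`, and a finished workflow arrives as a `workflow_run` event with `action` set to `completed`.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information about the match
    return hmac.compare_digest(expected, signature_header)


def extract_failed_run(event_name: str, body: bytes):
    """Return metadata for a failed workflow run, or None for any other event."""
    if event_name != "workflow_run":
        return None
    payload = json.loads(body)
    run = payload.get("workflow_run", {})
    if payload.get("action") != "completed" or run.get("conclusion") != "failure":
        return None
    return {
        "repository": payload.get("repository", {}).get("full_name"),
        "run_id": run.get("id"),
        "head_sha": run.get("head_sha"),
        "logs_url": run.get("logs_url"),
    }
```

A route handler would read the raw body, reject the request if `verify_signature` returns False, and pass the `X-GitHub-Event` header value to `extract_failed_run`.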
🔹 Step 2 — AI Analysis with CrewAI
RepoPilot runs CrewAI agents to:
- Parse CI logs
- Identify root cause
- Generate patch suggestions
- Provide fix explanation
This makes the response structured and actionable.
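One way to keep the agent response structured is to ask the agents for JSON and validate it before acting on it. The sketch below assumes that convention; the `FixSuggestion` type, its field names, and `parse_agent_output` are hypothetical and do not come from CrewAI or from the RepoPilot codebase.

```python
import json
from dataclasses import dataclass


@dataclass
class FixSuggestion:
    """Structured result of one analysis run (illustrative shape)."""

    root_cause: str   # what the agents believe went wrong
    patch: str        # suggested code change, for example a unified diff
    explanation: str  # human-readable reasoning for the pull request body


REQUIRED_FIELDS = ("root_cause", "patch", "explanation")


def parse_agent_output(raw: str) -> FixSuggestion:
    """Validate an agent's JSON reply and convert it into a FixSuggestion."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("agent output must be a JSON object")
    missing = [key for key in REQUIRED_FIELDS if not data.get(key)]
    if missing:
        raise ValueError("agent output missing fields: " + ", ".join(missing))
    return FixSuggestion(*(str(data[key]) for key in REQUIRED_FIELDS))
```

Rejecting incomplete replies early keeps a half-formed answer from turning into a pull request.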
🔹 Step 3 — Semantic Indexing in Elasticsearch
The failure is stored using a semantic_text field.
This enables:
- Automatic vector embeddings
- Meaning-based similarity matching
- Fast top-K retrieval of related past failures
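An index mapping for this step might look like the sketch below, written as a Python dict that could be passed to an Elasticsearch client when creating the index. The index name and all fields other than `error_text` are assumptions. Depending on the Elasticsearch version, a `semantic_text` field may also need an `inference_id` pointing at a configured inference endpoint; that detail is left out here.

```python
INDEX_NAME = "ci-failures"  # hypothetical index name


def build_index_mapping() -> dict:
    """Return an index body whose error_text field is embedded automatically."""
    return {
        "mappings": {
            "properties": {
                # semantic_text tells Elasticsearch to generate embeddings at index time
                "error_text": {"type": "semantic_text"},
                "root_cause": {"type": "text"},
                # the diff is stored for display only, so it is not indexed
                "fix_diff": {"type": "text", "index": False},
                "repository": {"type": "keyword"},
                "pull_request_url": {"type": "keyword"},
                "timestamp": {"type": "date"},
            }
        }
    }
```

Only the log text is embedded; metadata stays as keyword and date fields so results can still be filtered by repository or time range.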
🔹 Step 4 — Retrieval on New Failure
When a new build fails:
- Logs are trimmed and submitted as a semantic query
- Elasticsearch returns the most similar past failures along with their fixes
- RepoPilot uses those results to improve fix generation
The more failures are stored, the more past fixes each new query can draw on.
This creates a continuous learning feedback loop.
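The retrieval loop above can be sketched as three small functions: trim the log, build a `semantic` query against `error_text`, and format the hits as context for fix generation. The function names, the 2000-character limit, and the `_source` field names are assumptions; the hit layout (`_score`, `_source`) follows the standard Elasticsearch search response.

```python
def trim_log(log: str, max_chars: int = 2000) -> str:
    """Keep the tail of the log, where CI errors usually appear."""
    if len(log) <= max_chars:
        return log
    tail = log[-max_chars:]
    newline = tail.find("\n")
    # drop the first line of the tail, which may have been cut mid-line
    if newline != -1 and newline + 1 < len(tail):
        return tail[newline + 1:]
    return tail


def build_semantic_query(log: str, top_k: int = 5) -> dict:
    """Build a search body that asks for the top_k most similar past failures."""
    return {
        "size": top_k,
        "query": {"semantic": {"field": "error_text", "query": trim_log(log)}},
        "_source": ["root_cause", "fix_diff", "pull_request_url"],
    }


def format_past_fixes(hits: list[dict]) -> str:
    """Turn search hits into a text block that can be added to an agent prompt."""
    lines = []
    for rank, hit in enumerate(hits, start=1):
        src = hit.get("_source", {})
        lines.append(
            f"{rank}. score={hit.get('_score')} "
            f"cause={src.get('root_cause', 'unknown')} "
            f"pr={src.get('pull_request_url', 'n/a')}"
        )
    return "\n".join(lines) if lines else "No similar past failures found."
```

Trimming from the end is a deliberate choice here: setup output at the top of a CI log is mostly the same across runs, while the failing step sits near the bottom.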
⚙️ Tech Stack
- Backend: FastAPI
- AI Agents: CrewAI
- Vector Search: Elasticsearch (semantic_text)
- Integration: GitHub App + Webhooks
- Automation: Pull Request generation
🏆 Key Achievements
- Built a fully functional GitHub App
- Implemented AI agent-based CI log analysis
- Designed a semantic memory architecture
- Enabled automated PR generation from CI failures
- Created a self-improving debugging loop
🚀 Future Improvements
- CI-specific fine-tuned embeddings
- Cross-repository failure intelligence
- Slack / Teams integration
- Patch confidence scoring
- Enterprise dashboard analytics
🔗 Links
🔗 GitHub Repository
https://github.com/Jaga0001/repo_pilot
🔗 GitHub App
https://github.com/apps/repo-pilot
🧠 Vision
RepoPilot turns CI from a reactive system into a learning organism.
Instead of asking:
"Why did this fail again?"
The system responds:
"I've seen this before. Here's the fix."
⭐ If you like this project, consider giving it a star!
Built With
- agents
- crewai
- elasticsearch
- fastapi
- git
- github
- python
- render