Inspiration
CI/CD pipeline failures are common and often time-consuming to debug. We were inspired to build a system that not only analyzes errors but also learns from past failures, reducing repetitive debugging and improving developer productivity.
What it does
The project analyzes CI/CD error logs, identifies the root cause, suggests fixes, and stores past errors as embeddings. When a new failure arrives, it retrieves similar past failures so that known fixes can be reused, which speeds up debugging.
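The retrieval idea can be illustrated with a minimal sketch. It is not the project's code: the `FailureMemory` class and its methods are hypothetical names, and the `embed` function is a toy word-count stand-in for the Sentence Transformers embeddings the project actually uses, chosen so the example runs with the standard library only.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class FailureMemory:
    """Hypothetical store of past failures and the fixes that resolved them."""

    def __init__(self) -> None:
        self._entries = []  # list of (vector, error text, fix text)

    def add(self, error: str, fix: str) -> None:
        self._entries.append((embed(error), error, fix))

    def search(self, error: str, top_k: int = 1) -> list:
        """Return the top_k (score, error, fix) tuples, most similar first."""
        query = embed(error)
        scored = [(cosine(query, vec), err, fix) for vec, err, fix in self._entries]
        return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]


memory = FailureMemory()
memory.add("ModuleNotFoundError: No module named 'requests'",
           "Add requests to requirements.txt")
memory.add("npm ERR! code ELIFECYCLE", "Clear node_modules and reinstall")

# A new failure that resembles the first stored one.
best = memory.search("ModuleNotFoundError: No module named 'numpy'")[0]
print(best[2])
```

With a real embedding model the vectors would be dense and the match would be semantic rather than word-based, but the store-then-rank flow stays the same.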
How we built it
We built a full-stack system using FastAPI for backend APIs and Streamlit for the frontend dashboard. Error logs are processed using a parser, converted into embeddings using Sentence Transformers, and stored for similarity-based retrieval. A rule-based system with optional LLM integration generates root cause analysis and fixes.
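As a rough illustration of the parsing and rule-based steps, the sketch below pulls error lines out of a raw log and maps them to a root cause and fix. The regular expressions, the `RULES` table, and the function names are assumptions made for this example, not the project's actual rules.

```python
import re

# Assumed heuristic: a line is "interesting" if it mentions one of these words.
ERROR_LINE = re.compile(r"error|exception|failed|fatal", re.IGNORECASE)

# Hypothetical rule table: (pattern, root cause, fix template).
RULES = [
    (re.compile(r"No module named '([\w.]+)'"),
     "Missing Python dependency",
     "Add {0} to the project's dependencies"),
    (re.compile(r"exit code 137"),
     "Process was killed, often because it ran out of memory",
     "Check the runner's memory limit"),
]


def parse_errors(log: str) -> list:
    """Keep only the log lines that look like errors."""
    return [line.strip() for line in log.splitlines() if ERROR_LINE.search(line)]


def analyze(line: str) -> dict:
    """Return the first matching rule's root cause and fix, or a default."""
    for pattern, cause, fix in RULES:
        match = pattern.search(line)
        if match:
            return {"root_cause": cause, "fix": fix.format(*match.groups())}
    return {"root_cause": "Unknown", "fix": "No matching rule"}


sample_log = """Step 1/4: install dependencies
ModuleNotFoundError: No module named 'yaml'
Build failed
"""
errors = parse_errors(sample_log)
report = [analyze(line) for line in errors]
```

In a setup like the one described, a function such as `analyze` would sit behind a FastAPI endpoint that the Streamlit dashboard calls.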
Challenges we ran into
We faced issues with environment setup, dependency conflicts, and integrating local LLMs reliably. Handling noisy real-world logs and keeping the system stable when no LLM is available were also challenging.
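One common way to stay stable without an LLM is to treat the model as optional and fall back to rules whenever it fails. The sketch below shows that pattern under assumed names (`analyze_with_fallback`, `rule_based`, `llm`); it is not taken from the project.

```python
from typing import Callable, Optional


def analyze_with_fallback(
    error: str,
    rule_based: Callable[[str], str],
    llm: Optional[Callable[[str], str]] = None,
) -> dict:
    """Try the optional LLM first; use the rule-based analyzer if it fails."""
    if llm is not None:
        try:
            answer = llm(error)
            if answer and answer.strip():
                return {"source": "llm", "analysis": answer.strip()}
        except Exception:
            # Any LLM failure (timeout, connection error, bad response)
            # falls through to the rules instead of breaking the request.
            pass
    return {"source": "rules", "analysis": rule_based(error)}


def simple_rules(error: str) -> str:
    """Hypothetical rule-based analyzer used only for this example."""
    if "No module named" in error:
        return "Missing Python dependency"
    return "Unknown failure"


def offline_llm(error: str) -> str:
    """Simulates a local LLM server that is not running."""
    raise ConnectionError("LLM server unreachable")


result = analyze_with_fallback("No module named 'yaml'", simple_rules, offline_llm)
```

Here `result["source"]` is `"rules"` because the simulated LLM call raises, so the caller still gets an analysis.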
Accomplishments that we're proud of
We successfully built a memory-based AI system that learns from past errors, provides accurate fixes, and works reliably even without external APIs. The system is fast, stable, and demo-ready.
What we learned
We gained hands-on experience in vector embeddings, semantic search, API development, and system design. We also learned how to build fault-tolerant AI systems and manage real-world DevOps challenges.
What's next for AI_DevOps_Failure_MemoryAgent
We plan to add real-time CI/CD monitoring, GitLab/GitHub integration, automated code fixes, and a more advanced LLM-based reasoning system for deeper analysis.
Built With
- fastapi
- ollama-(llm)
- python
- rest-apis
- sentence-transformers
- streamlit
- vector-embeddings