Inspiration

CI/CD pipeline failures are common and often time-consuming to debug. We were inspired to build a system that not only analyzes errors but also learns from past failures, reducing repetitive debugging and improving developer productivity.

What it does

The project analyzes CI/CD error logs, identifies the likely root cause, and suggests fixes. It stores each past error as an embedding and retrieves similar previous failures, so recurring problems can be diagnosed faster.
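
To make the "memory" idea concrete, here is a minimal sketch of storing past failures and retrieving the most similar one by cosine similarity. It is an illustration under assumptions, not the project's actual code: the `FailureMemory` class, its method names, and the sample errors are hypothetical, and `toy_embed` is a plain token-count stand-in used so the example runs with the standard library alone. In the real system a Sentence Transformers model would supply the vectors.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in embedding: lowercase token counts (NOT a Sentence Transformers model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class FailureMemory:
    """Hypothetical in-memory store of past failures and their fixes."""

    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.records = []  # list of (vector, error_text, fix)

    def add(self, error_text, fix):
        """Embed the error once and remember it together with its fix."""
        self.records.append((self.embed(error_text), error_text, fix))

    def most_similar(self, error_text, top_k=1):
        """Return the top_k stored failures as (score, error_text, fix), best first."""
        query = self.embed(error_text)
        scored = [(cosine(query, vec), text, fix) for vec, text, fix in self.records]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


# Illustrative data only.
memory = FailureMemory()
memory.add("ModuleNotFoundError: No module named 'requests'", "Add requests to requirements.txt")
memory.add("npm ERR! code ELIFECYCLE", "Clear node_modules and reinstall")

best = memory.most_similar("ModuleNotFoundError: No module named 'numpy'")[0]
print(best[2])
```

Because the embedding function is passed in as a parameter, the toy embedder could be swapped for a real model without changing the retrieval logic.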

How we built it

We built a full-stack system with FastAPI for the backend APIs and Streamlit for the frontend dashboard. A parser processes the error logs, Sentence Transformers converts them into embeddings, and the embeddings are stored for similarity-based retrieval. A rule-based system with optional LLM integration generates the root cause analysis and suggested fixes.
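
The parsing and rule-based steps described above can be sketched roughly as follows. This is a simplified illustration, not our exact implementation: the `RULES` table, the regular expressions, the function names, and the sample log are all assumptions chosen for the example.

```python
import re

# Hypothetical rule table: (pattern, root cause, suggested fix).
RULES = [
    (re.compile(r"ModuleNotFoundError: No module named '([\w.]+)'"),
     "Missing Python dependency",
     "Add the missing package to the dependency file and reinstall"),
    (re.compile(r"Permission denied"),
     "Insufficient file or runner permissions",
     "Check file modes and the CI runner's user"),
    (re.compile(r"timed? ?out", re.IGNORECASE),
     "Step exceeded a time limit",
     "Raise the timeout or investigate the slow step"),
]

# Lines worth keeping, and an ISO-like timestamp prefix to strip from them.
ERROR_LINE = re.compile(r"error|exception|failed|denied|timed? ?out", re.IGNORECASE)
TIMESTAMP = re.compile(r"^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*\]?\s*")


def parse_log(raw_log):
    """Keep only lines that look like errors, with leading timestamps stripped."""
    lines = []
    for line in raw_log.splitlines():
        line = TIMESTAMP.sub("", line).strip()
        if line and ERROR_LINE.search(line):
            lines.append(line)
    return lines


def analyze(raw_log):
    """Return the first matching rule's diagnosis, or a generic fallback."""
    for line in parse_log(raw_log):
        for pattern, cause, fix in RULES:
            if pattern.search(line):
                return {"line": line, "root_cause": cause, "fix": fix}
    return {"line": None, "root_cause": "Unknown", "fix": "Inspect the log manually"}


# Made-up log for illustration.
sample = """2024-01-01T10:00:00Z Installing dependencies
2024-01-01T10:00:05Z Running tests
2024-01-01T10:00:06Z ModuleNotFoundError: No module named 'requests'
2024-01-01T10:00:06Z Job failed with exit code 1
"""
result = analyze(sample)
print(result["root_cause"])
```

In a setup like this, the cleaned error lines returned by `parse_log` would also be the text that gets embedded and stored for later retrieval.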

Challenges we ran into

We faced issues with environment setup, dependency conflicts, and integrating local LLMs reliably. Handling noisy real-world logs and keeping the system stable when no LLM is available were also challenging.
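
One way to keep the system usable when no LLM is available is to treat the model as an optional callable and fall back to the rule-based path on any failure. The sketch below shows that pattern under assumptions: `analyze_with_fallback`, `rule_based_analysis`, and `unavailable_llm` are hypothetical names, and the single rule is only a placeholder.

```python
def rule_based_analysis(error_text):
    """Hypothetical deterministic path that needs no model."""
    if "No module named" in error_text:
        return "Missing dependency: add the package to the dependency file"
    return "No rule matched: inspect the log manually"


def analyze_with_fallback(error_text, llm=None):
    """Try the optional LLM callable; fall back to rules on absence, failure, or empty output."""
    if llm is not None:
        try:
            answer = llm(error_text)
            if answer and answer.strip():
                return {"source": "llm", "analysis": answer.strip()}
        except Exception:
            # Any LLM failure (connection refused, timeout, bad response) is non-fatal.
            pass
    return {"source": "rules", "analysis": rule_based_analysis(error_text)}


def unavailable_llm(prompt):
    """Simulates a local model server that is not running."""
    raise ConnectionError("model server unreachable")


result = analyze_with_fallback(
    "ModuleNotFoundError: No module named 'requests'", llm=unavailable_llm
)
print(result["source"])
```

Returning the `source` alongside the analysis lets a dashboard show whether an answer came from the model or from the rules.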

Accomplishments that we're proud of

We built a memory-based AI system that learns from past errors, suggests fixes, and works reliably even without external APIs. The system is fast, stable, and demo-ready.

What we learned

We gained hands-on experience in vector embeddings, semantic search, API development, and system design. We also learned how to build fault-tolerant AI systems and manage real-world DevOps challenges.

What's next for AI_DevOps_Failure_MemoryAgent

We plan to add real-time CI/CD monitoring, GitLab and GitHub integration, automated code fixes, and a more advanced LLM-based reasoning system for deeper analysis.

Built With

  • fastapi
  • ollama-(llm)
  • python
  • rest-apis
  • sentence-transformers
  • streamlit
  • vector-embeddings