Crash-Copilot

💡 Inspiration

Every developer knows the pain of a broken CI/CD pipeline. You get an alert, you open the terminal, and you spend 45 minutes scrolling through a wall of unformatted stderr spaghetti just to find a missing comma. We realized that while AI coding assistants are great for writing code, they are not around when that code crashes in an automated pipeline. We wanted to build an autonomous Site Reliability Engineer (SRE): an agent that lives in the pipeline, catches the crash, and hands you the fix before you even open your IDE.

⚙️ What it does

Crash-Copilot Cloud is an automated AI debugging agent. It runs in the background and intercepts failed commands or broken CI/CD builds via webhooks. Instead of just spitting out an error code, it identifies the file that crashed, isolates the surrounding context, and feeds both into an LLM reasoning engine.

Within seconds, it delivers an interactive HTML report (or a Slack/Discord message) containing:

The root cause of the crash.
The exact rewritten code block (with a 1-click copy button).
A persistent chat companion pre-loaded with the crash context for follow-up questions.

🏗️ How we built it

We engineered this with scalability and enterprise adoption in mind:

The Pipeline: We used n8n to build a scalable, low-code webhook workflow. It lets Crash-Copilot receive crash payloads from GitHub Actions or from a local execution simulator.
The Brain: The core reasoning engine is powered by Z.AI's GLM-5.1. We chose GLM-5.1 for its massive context window and rapid response times.
Model Agnostic (BYOM): We built the backend to be flexible. Enterprise teams can easily swap out GLM-5.1 for OpenAI, Anthropic, or local open-source models to maintain strict data privacy.
The Parser: Custom Python regular expressions extract file paths and tracebacks from raw terminal output.
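As an illustration of the parser idea, the sketch below pulls frames and the final error line out of a standard CPython traceback. The patterns and the `parse_traceback` name are assumptions written for this example; the project's real expressions are not shown in this writeup and likely handle more formats.

```python
import re
from typing import Optional

# Matches CPython frame lines such as:
#   File "app/main.py", line 12, in handler
FRAME_RE = re.compile(
    r'^[ \t]*File "(?P<path>[^"]+)", line (?P<line>\d+)(?:, in (?P<func>.+))?$',
    re.MULTILINE,
)

# Matches an unindented final line such as "ZeroDivisionError: division by zero"
ERROR_RE = re.compile(
    r"^(?P<type>[A-Za-z_][\w.]*): (?P<message>.*)$",
    re.MULTILINE,
)


def parse_traceback(stderr: str) -> Optional[dict]:
    """Extract frames and the error line; return None if no frame is found."""
    frames = [
        {"path": m["path"], "line": int(m["line"]), "function": m["func"]}
        for m in FRAME_RE.finditer(stderr)
    ]
    if not frames:
        return None
    errors = list(ERROR_RE.finditer(stderr))
    last_error = errors[-1] if errors else None
    return {
        "frames": frames,
        # The last frame is where the exception was raised.
        "crash_site": frames[-1],
        "error_type": last_error["type"] if last_error else None,
        "message": last_error["message"] if last_error else None,
    }


SAMPLE = """Traceback (most recent call last):
  File "app/main.py", line 12, in <module>
    run()
  File "app/service.py", line 40, in run
    return 1 / 0
ZeroDivisionError: division by zero
"""

result = parse_traceback(SAMPLE)
print(result["crash_site"]["path"], result["crash_site"]["line"])
```

One caveat with this approach: the last frame is often inside a third-party library, so a fuller parser would probably walk the frames backwards until it reaches a path inside the project.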

⚠️ Challenges we ran into

The hardest part was wrangling the LLM output. Early versions of the AI would give us a rambling 5-paragraph essay about the error instead of just fixing the code. We had to do some heavy prompt engineering and strict JSON formatting to force the AI to return only the specific fixed code block and a concise root-cause bullet point.
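The strict-JSON approach can be sketched as a small validator that rejects anything other than the two expected fields. The field names `root_cause` and `fixed_code`, the prompt wording, and the helper name are hypothetical; they stand in for whatever schema the actual prompt enforces.

```python
import json

# Hypothetical schema: exactly these two string fields, nothing else.
REQUIRED_FIELDS = {"root_cause": str, "fixed_code": str}

# Hypothetical instruction appended to the prompt sent to the model.
FORMAT_INSTRUCTION = (
    "Reply with a single JSON object containing exactly two string fields: "
    "'root_cause' (one concise sentence) and 'fixed_code' (the corrected code "
    "block only). Do not add any other text."
)

FENCE = "`" * 3  # Markdown code fence that models sometimes wrap JSON in


def parse_model_reply(raw: str) -> dict:
    """Return the validated reply, or raise ValueError if it breaks the schema."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence and language tag
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("reply is not valid JSON") from exc
    if not isinstance(data, dict) or set(data) != set(REQUIRED_FIELDS):
        raise ValueError("reply must contain exactly the required fields")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data[name], expected_type) or not data[name].strip():
            raise ValueError(f"field {name!r} must be a non-empty string")
    return data
```

Rejecting extra keys is deliberate here: an essay smuggled into an unexpected field fails validation, and the caller can retry the request instead of rendering it.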

🏆 Accomplishments that we're proud of

We took a fragmented, manual debugging process and turned it into a fully automated cloud-native pipeline. Getting the n8n webhook to seamlessly catch a local Python crash and return a formatted HTML UI in under 5 seconds felt like magic.

🚀 What's next for Crash-Copilot

Our immediate next step is Auto-Remediation. We want to connect Crash-Copilot directly to the GitHub API so that when a build fails, the agent doesn't just notify the team—it automatically opens a Pull Request with the fixed code attached.
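As a rough sketch of what that step could look like, the snippet below builds (but does not send) a request to GitHub's REST endpoint for creating a pull request. It assumes the fix has already been committed and pushed to a branch; the function name, branch names, and PR title are placeholders for illustration, not part of the current project.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_pull_request(
    owner: str,
    repo: str,
    token: str,
    head_branch: str,
    base_branch: str,
    root_cause: str,
) -> urllib.request.Request:
    """Build a POST request for /repos/{owner}/{repo}/pulls without sending it."""
    body = {
        "title": "Crash-Copilot: proposed fix for failed build",  # placeholder
        "head": head_branch,  # branch that already contains the fix commit
        "base": base_branch,  # branch the fix should merge into
        "body": f"Root cause: {root_cause}",
    }
    return urllib.request.Request(
        f"{GITHUB_API}/repos/{owner}/{repo}/pulls",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. Keeping construction separate from sending makes the payload easy to inspect or log before the agent is trusted to open pull requests on its own.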