Inspiration

Plenty of tools can alert you to errors and crashes. Incident.io, for example, lets you know when there's a bug on the production site. We wanted to go a step further and make the report itself as useful as possible, so developers don't waste time staring at the codebase hunting for the root cause.

We built an AI-driven tool that generates a report on the crash before anyone notices and announces it on Slack. The report contains not only the bug and error type, but also the commit history with the top committers (who are most responsible for the code!), diffs, and an AI-generated summary of the error. All developers are notified as soon as the crash happens, and everyone knows who to contact.

What it does

Incident.ai is an automated incident triage system that turns raw crash stacktraces into actionable alerts in seconds:

  1. Delivers crash reports straight to your workplace (Slack today, with more integrations coming)
  2. Analyses stacktraces with Claude AI to explain what went wrong in plain English
  3. Traces incidents to Git commits by fetching recent code changes from GitHub
  4. Identifies top contributors who worked on the affected files

Instead of manually digging through logs and Git history, teams get instant insights: the crash reason, recent commits that might have caused it, and who to contact, all in one Slack message.
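
To make the "one Slack message" idea concrete, here is a minimal sketch of how such a report could be flattened into a webhook payload. The `IncidentReport` and `CommitInfo` shapes and the `formatSlackMessage` name are assumptions made up for this illustration, not the project's actual types. The one external fact relied on is that Slack incoming webhooks accept a JSON body with a `text` field.

```typescript
// Hypothetical shapes: these field names are illustrative only.
interface CommitInfo {
  sha: string;
  message: string;
  author: string;
}

interface IncidentReport {
  errorType: string;
  summary: string; // AI-generated plain-English explanation
  recentCommits: CommitInfo[];
  topContributors: string[];
}

// Build the JSON body for a Slack incoming webhook ({ text: string }).
// Asterisks render as bold in Slack's mrkdwn formatting.
function formatSlackMessage(report: IncidentReport): { text: string } {
  const header = [
    `*Crash detected: ${report.errorType}*`,
    report.summary,
    "*Recent commits:*",
  ];
  // One line per commit: short SHA, message, author.
  const commitLines = report.recentCommits.map(
    (c) => `• ${c.sha.slice(0, 7)} ${c.message} (${c.author})`
  );
  const footer = [`*Who to contact:* ${report.topContributors.join(", ")}`];
  return { text: header.concat(commitLines, footer).join("\n") };
}
```

The returned object could then be sent as the JSON body of a POST to the team's webhook URL.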

How we built it

Stack:

  • Backend: Node.js + Express for the API server
  • AI: Claude (via Anthropic SDK + Vercel AI SDK) for stacktrace analysis
  • Integrations: GitHub API for commit forensics, Slack webhooks for notifications
  • Testing: Custom test runner with colored console output

Architecture:

  • RESTful API with /api/incident endpoint to receive stacktraces
  • AI service layer using Claude Haiku for cost-effective text generation
  • GitHub service with functions for:
    • getTopCommitsWithDiffs() - fetch recent commits with diffs for specific files
    • getTopAuthorsForFile() - identify developers who worked on affected code
    • getFileCommitHistory() - trace code changes over time
  • Slack integration for real-time team notifications
  • Middleware for request tracking and centralised error handling
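
As an illustration of the kind of logic a function like getTopAuthorsForFile() might contain, the sketch below ranks authors by how many commits touched a file. It assumes the commits have already been fetched (GitHub's list-commits endpoint can filter by a `path` query parameter) and reduced to a simplified shape. The `FileCommit`, `AuthorStat`, and `rankAuthors` names are hypothetical and not taken from the project's code.

```typescript
// Simplified, hypothetical commit shape; real GitHub API responses carry more fields.
interface FileCommit {
  sha: string;
  authorLogin: string;
  date: string; // ISO 8601 UTC timestamp, e.g. "2024-01-03T10:00:00Z"
}

interface AuthorStat {
  login: string;
  commits: number;
  lastCommit: string;
}

// Rank authors by commit count, breaking ties by most recent commit.
function rankAuthors(commits: FileCommit[], limit = 3): AuthorStat[] {
  const stats: { [login: string]: AuthorStat } = {};
  for (const c of commits) {
    let stat = stats[c.authorLogin];
    if (!stat) {
      stat = { login: c.authorLogin, commits: 0, lastCommit: c.date };
      stats[c.authorLogin] = stat;
    }
    stat.commits += 1;
    // ISO 8601 timestamps in the same format and timezone sort lexicographically.
    if (c.date > stat.lastCommit) stat.lastCommit = c.date;
  }
  return Object.keys(stats)
    .map((login) => stats[login])
    .sort((a, b) => {
      if (b.commits !== a.commits) return b.commits - a.commits;
      if (a.lastCommit === b.lastCommit) return 0;
      return a.lastCommit < b.lastCommit ? 1 : -1;
    })
    .slice(0, limit);
}
```

Breaking ties by recency is a design choice for this sketch: when two people have the same number of commits, the one who touched the file most recently is more likely to remember the context.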

Challenges we ran into

  1. Parsing stacktraces to identify files - Stacktraces from different languages/frameworks have wildly different formats. We focused on getting the AI to extract meaningful information regardless of format.
  2. GitHub API rate limits - We initially hit rate limits while testing, so we added GitHub token authentication and trimmed our API calls to fetch only the data we need.
  3. AI prompt engineering - Tuning the AI to provide concise yet detailed crash analysis required iteration. Too verbose and developers ignore it; too brief and it lacks context.
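
We relied on the AI to cope with format differences, but a small heuristic pre-pass shows why the problem is awkward. The sketch below is not the project's approach. It is a hypothetical `extractAppFrames` helper that recognises only two frame formats (V8/Node.js and Python) and skips dependency frames, so that GitHub lookups would target application files. The patterns are assumptions and will miss other languages.

```typescript
interface Frame {
  file: string;
  line: number;
}

// Assumed patterns for two common formats; other runtimes need their own.
const FRAME_PATTERNS: RegExp[] = [
  // V8 / Node.js: "    at fn (/app/src/cart.js:42:13)" or "    at /app/src/cart.js:42:13"
  /^\s*at .*?\(?([^\s()]+):(\d+):\d+\)?\s*$/,
  // Python: '  File "/srv/app/orders.py", line 17, in handle'
  /^\s*File "([^"]+)", line (\d+)/,
];

// Frames from dependencies or the runtime itself are rarely the root cause.
function isDependency(file: string): boolean {
  return (
    file.indexOf("node_modules") !== -1 ||
    file.indexOf("site-packages") !== -1 ||
    file.indexOf("node:") === 0
  );
}

// Return the first frame seen for each application file, in stack order.
function extractAppFrames(stacktrace: string): Frame[] {
  const seen: { [file: string]: boolean } = {};
  const frames: Frame[] = [];
  for (const rawLine of stacktrace.split("\n")) {
    for (const pattern of FRAME_PATTERNS) {
      const m = pattern.exec(rawLine);
      if (!m) continue;
      const file = m[1];
      if (!isDependency(file) && !seen[file]) {
        seen[file] = true;
        frames.push({ file: file, line: parseInt(m[2], 10) });
      }
      break; // a line matches at most one format
    }
  }
  return frames;
}
```

Each new language needs another pattern, which is the maintenance burden that handing the raw stacktrace to the model avoids.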

Accomplishments that we're proud of

  • A working end-to-end incident pipeline, from stacktrace ingestion to Slack notification
  • AI-powered crash analysis that explains technical errors in clear language
  • Robust GitHub integration with functions to fetch commits, diffs, and author statistics
  • Clean architecture with separation of concerns (routes, controllers, services)
  • Custom test framework with colored output and comprehensive test coverage
  • Production-ready error handling with HTTP error classes and request ID tracking

We also built a suite of additional tools:

  • Incident dashboard for monitoring org-wide incidents.
  • VS Code extension to help identify "code owners".

What we learned

  • AI is transformative for DevOps - Claude parses and explains complex stacktraces with surprising accuracy, which saves a lot of debugging time
  • GitHub's API is powerful - We can extract incredibly useful forensic data (diffs, commit frequency by author, file change history) to pinpoint when bugs were introduced.
  • Small integrations, big impact - Connecting just three services (AI, GitHub, Slack) creates a tool that could genuinely change how teams handle incidents

What's next for Incident.ai

  • Integration with Jira or Linear.
  • AI-powered auto-commit fixes with human oversight.
  • Learning feedback loop: track which commits actually fixed crashes to improve future predictions.
