Inspiration

(Please turn on captions in the YouTube demo video.)

Debugging is still one of the most manual and unstructured parts of software engineering.

While AI tools can autocomplete code, incident resolution still means:

  • Searching logs in Sentry or Vercel
  • Grepping through large repositories
  • Reading Confluence documentation
  • Coordinating in Slack threads
  • Manually tracking issue and PR state

We validated this with a survey of 10 engineers across startups and big tech:

  • 50% spend 30 minutes to 2 hours on a typical bug
  • Nearly 50% spend 2+ hours on larger issues
  • 44% identified environment or deployment mismatch as the biggest source of debugging time

That translates to roughly 20–30 hours per engineer per month spent debugging — or $4,000–$5,000 per engineer per month at typical startup salary costs.

What stood out wasn’t that bugs are difficult — it’s that they’re fragmented.

Engineers lose time stitching together context from:

  • Code
  • Logs
  • Documentation
  • Deployment environments
  • Slack conversations

We realized debugging isn’t primarily an intelligence problem — it’s a context aggregation and coordination problem.

So we set out to transform incident resolution from a chaotic Slack thread into a deterministic, observable workflow with explicit state transitions and full auditability.


What It Does

Buzz is an AI incident engineer that lives in Slack and synchronizes with GitHub.

When you mention @Buzz in Slack or open a GitHub issue, it:

  • Creates a structured Case
  • Initializes a deterministic lifecycle state
  • Syncs with GitHub issue metadata
  • Streams investigation events live in a Slack thread
  • Logs every transition and agent output

Each bug follows a transparent lifecycle:

NEW → TRIAGED → INVESTIGATING → REPORT_READY → PATCHING → PR_OPENED → REVIEW → RESOLVED

Instead of a black-box AI generating a PR, Buzz exposes:

  • Files inspected
  • Logs queried
  • Documentation referenced
  • Confidence levels
  • Patch reasoning
  • CI results

Slack becomes a real-time UI over a controlled backend workflow engine.


How We Built It

We built Buzz as a structured backend-first system, not a loose autonomous agent.


1. GitHub App + Verified Webhooks

We implemented a GitHub App with scoped permissions:

  • Issues: Read
  • Pull Requests: Read & Write
  • Contents: Read
  • Checks: Read

Webhook ingestion includes:

  • issues.opened
  • issue_comment
  • pull_request.*
  • check_run
  • workflow_run

Security measures:

  • HMAC SHA-256 signature verification
  • Installation ID–based scoped tokens
  • Immediate 200 response with async processing

All webhook events are processed asynchronously to avoid blocking and ensure reliability.
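The signature check above can be sketched in a few lines of stdlib Python. GitHub sends the digest in the X-Hub-Signature-256 header as "sha256=<hex>"; the function name here is illustrative:

```python
import hashlib
import hmac

def verify_github_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Validate GitHub's X-Hub-Signature-256 header against the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(expected, signature_header)
```

On a valid signature the handler returns 200 immediately and hands the payload to the async queue; anything that fails verification is dropped before it can touch a case.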


2. Deterministic Case State Machine

We implemented a strict state machine with validated transitions.

  • Illegal transitions are rejected
  • Every state change is persisted
  • Each transition includes metadata for replayability
  • Full audit logging per case

Each case stores:

  • Issue metadata
  • Agent outputs
  • Investigation artifacts
  • PR links
  • CI results
  • Timestamped state transitions

This prevents uncontrolled agent behavior and guarantees workflow consistency.
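A minimal sketch of the transition guard, using the state names from the lifecycle above (the class and field names are illustrative, not our exact schema):

```python
from datetime import datetime, timezone

# Allowed next states for each lifecycle state described above.
TRANSITIONS = {
    "NEW": {"TRIAGED"},
    "TRIAGED": {"INVESTIGATING"},
    "INVESTIGATING": {"REPORT_READY"},
    "REPORT_READY": {"PATCHING"},
    "PATCHING": {"PR_OPENED"},
    "PR_OPENED": {"REVIEW"},
    "REVIEW": {"RESOLVED"},
    "RESOLVED": set(),
}

class IllegalTransition(Exception):
    pass

class Case:
    def __init__(self):
        self.state = "NEW"
        self.audit_log = []  # every transition persisted with metadata

    def transition(self, target: str, **metadata):
        # Reject any transition not in the whitelist for the current state.
        if target not in TRANSITIONS[self.state]:
            raise IllegalTransition(f"{self.state} -> {target}")
        self.audit_log.append({
            "from": self.state,
            "to": target,
            "at": datetime.now(timezone.utc).isoformat(),
            **metadata,
        })
        self.state = target
```

Because every change goes through one guarded method, the audit log doubles as a replayable history of the case.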


3. LangGraph-Based Agent Orchestration

We orchestrate the investigation pipeline using LangGraph as a DAG:

START
 → Triage
 → Codebase Search
 → Documentation Analysis
 → Log Analysis
 → Patch Generation
 → Report
 → END

Each node:

  • Receives structured state
  • Returns typed outputs
  • Emits real-time SSE events
  • Cannot mutate global state directly

This keeps agents deterministic and observable.
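The node contract can be illustrated without the LangGraph dependency: each node is a pure function that reads the accumulated state and returns only the fields it produced, and the runner merges results so no node mutates shared state. The node bodies below are toy stand-ins, not our actual agents:

```python
from typing import Callable

def triage(state: dict) -> dict:
    # Toy classifier standing in for the real triage agent.
    return {"severity": "high" if "crash" in state["issue"].lower() else "low"}

def codebase_search(state: dict) -> dict:
    # Illustrative result; the real node queries the indexed repository.
    return {"files": ["app/handlers.py"]}

PIPELINE: list[Callable[[dict], dict]] = [triage, codebase_search]

def run(issue: str) -> dict:
    state = {"issue": issue}
    for node in PIPELINE:
        # Merge node output into a fresh dict; nodes never mutate state in place.
        state = {**state, **node(state)}
    return state
```

LangGraph adds branching, typed state schemas, and streaming on top of this contract, but the determinism comes from the contract itself.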


4. Codebase Intelligence with Nia

Instead of naive file search, we integrated Nia to:

  • Index full GitHub repositories
  • Perform semantic search
  • Retrieve relevant file snippets with line numbers
  • Connect Confluence documentation into the same search space

This eliminates context switching between:

  • Code
  • Docs
  • Historical knowledge

It directly addresses one of the biggest debugging bottlenecks: finding the correct file and understanding how components interact.


5. Log Correlation Engine

Buzz integrates:

  • Sentry API
  • Vercel runtime logs

We generate structured log queries from triage outputs, then:

  • Retrieve relevant error events
  • Analyze suspicious patterns
  • Build chronological timelines
  • Surface environment mismatches

From our survey:

44% of engineers said environment/deployment mismatch causes the most debugging time.

Buzz directly automates this step.
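The timeline and mismatch steps can be sketched as ordering correlated events and flagging error fingerprints that appear in inconsistent environments. The event shape here is an assumption for illustration, not Sentry's actual payload:

```python
def build_timeline(events: list[dict]) -> list[dict]:
    """Order error events chronologically for the investigation report."""
    return sorted(events, key=lambda e: e["timestamp"])

def environment_mismatches(events: list[dict]) -> set[str]:
    """Surface error fingerprints seen in more than one environment."""
    envs_by_error: dict[str, set[str]] = {}
    for e in events:
        envs_by_error.setdefault(e["fingerprint"], set()).add(e["environment"])
    return {fp for fp, envs in envs_by_error.items() if len(envs) > 1}
```

A fingerprint that fires in preview but not production (or vice versa) is exactly the environment/deployment mismatch our survey respondents flagged.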


6. Patch Generation + Review Loop

The Patch Agent:

  • Generates minimal code diffs
  • Adds unit tests
  • Drafts structured PR descriptions
  • Creates feature branches via GitPython

We then:

  • Trigger CI workflows
  • Run CodeRabbit review automatically
  • Parse review feedback
  • Optionally iterate once for safe fixes

Every change is logged and linked.

No silent modifications.
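The branch-creation step can be sketched with the git CLI (the production pipeline uses GitPython, whose Repo.create_head does the equivalent; the branch naming scheme and case id are illustrative):

```python
import subprocess

def create_feature_branch(repo_path: str, case_id: str) -> str:
    """Create and check out a feature branch for a case's patch."""
    branch = f"buzz/fix-{case_id}"
    # -C runs git inside the cloned repo; check=True raises on failure
    # so the state machine never advances past a failed branch creation.
    subprocess.run(
        ["git", "-C", repo_path, "checkout", "-b", branch],
        check=True, capture_output=True,
    )
    return branch
```

The returned branch name is stored on the case, so the eventual PR link can be traced back to the investigation that produced it.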


Challenges We Ran Into

  • Maintaining Slack ↔ GitHub state consistency
  • Designing safe, validated state transitions
  • Preventing uncontrolled agent execution
  • Streaming real-time updates without race conditions
  • Making observability detailed but not overwhelming

We addressed these by enforcing strict state control and structured agent outputs.


Accomplishments We're Proud Of

  • Built a fully validated deterministic state machine
  • Implemented bidirectional Slack ↔ GitHub sync
  • Designed Slack threads as structured investigation timelines
  • Logged every transition for auditability and replay
  • Integrated code, docs, logs, and review into a unified workflow

What We Learned

AI automation works best when constrained by explicit state and structure.

Deterministic workflows outperform loosely chained prompts.

Observability builds trust in autonomous systems.


Sponsors & Technologies We Used

Anthropic (Claude API)

We used Claude as our primary reasoning engine for:

  • Structured triage classification
  • Code reasoning
  • Log analysis
  • Patch generation

We used structured JSON prompts to enforce deterministic outputs.
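In practice, enforcing deterministic outputs means validating the model's reply against a fixed schema before it can enter the state machine. A stdlib-only sketch (the field names are illustrative, not our exact triage schema):

```python
import json

# Expected shape of a triage reply; anything else is rejected.
REQUIRED_FIELDS = {"category": str, "severity": str, "confidence": float}

def parse_triage_output(raw: str) -> dict:
    """Parse a model reply, rejecting missing or mistyped fields."""
    data = json.loads(raw)
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"missing or mistyped field: {field}")
    return data
```

A reply that fails validation is retried rather than acted on, which is what keeps the downstream workflow deterministic even though the model itself is not.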


NVIDIA (Nemotron)

We integrated Nemotron as an alternative LLM provider for experimentation and provider abstraction.

Our architecture supports switching models without changing workflow logic.


Nia (trynia.ai)

We used Nia for:

  • Repository indexing
  • Semantic code search
  • Confluence documentation integration

This significantly improved codebase retrieval accuracy compared to naive embedding search.


Sentry

Used for:

  • Production log ingestion
  • Error event querying
  • Timeline reconstruction

CodeRabbit

Used for:

  • Automated PR review
  • Structured feedback parsing
  • Iterative safe patch refinement

What's Next for Buzz

  • Working closely with Nia as the platform matures to refine our product, including indexing Slack conversations
  • Case memory and learning from past incidents
  • Recurring bug pattern detection
  • CI-aware patch validation
  • Cross-repo investigation intelligence
  • Expansion beyond Slack into broader engineering workflows

Built With

Python · LangGraph · Anthropic Claude API · NVIDIA Nemotron · Nia · Sentry · Vercel · CodeRabbit · GitPython · Slack API · GitHub App
