Inspiration - PLEASE TURN ON CAPTIONS ON YOUTUBE

Debugging is still one of the most manual and unstructured parts of software engineering.

While AI tools can autocomplete code, incident resolution still means:

Searching logs in Sentry or Vercel
Grepping through large repositories
Reading Confluence documentation
Coordinating in Slack threads
Manually tracking issue and PR state

We validated this with a survey of 10 engineers across startups and big tech:

50% spend 30 minutes to 2 hours on a typical bug
Nearly 50% spend 2+ hours on larger issues
44% identified environment or deployment mismatch as the biggest source of debugging time

That translates to roughly 20–30 hours per engineer per month spent debugging — or $4,000–$5,000 per engineer per month at typical startup salary costs.

What stood out wasn’t that bugs are difficult — it’s that they’re fragmented.

Engineers lose time stitching together context from:

Code
Logs
Documentation
Deployment environments
Slack conversations

We realized debugging isn’t primarily an intelligence problem — it’s a context aggregation and coordination problem.

So we set out to transform incident resolution from a chaotic Slack thread into a deterministic, observable workflow with explicit state transitions and full auditability.

What It Does

Buzz is an AI incident engineer that lives in Slack and synchronizes with GitHub.

When you mention @Buzz in Slack or open a GitHub issue, it:

Creates a structured Case
Initializes a deterministic lifecycle state
Syncs with GitHub issue metadata
Streams investigation events live in a Slack thread
Logs every transition and agent output

Each bug follows a transparent lifecycle:

NEW → TRIAGED → INVESTIGATING → REPORT_READY → PATCHING → PR_OPENED → REVIEW → RESOLVED

Instead of a black-box AI generating a PR, Buzz exposes:

Files inspected
Logs queried
Documentation referenced
Confidence levels
Patch reasoning
CI results

Slack becomes a real-time UI over a controlled backend workflow engine.

How We Built It

We built Buzz as a structured backend-first system, not a loose autonomous agent.

1. GitHub App + Verified Webhooks

We implemented a GitHub App with scoped permissions:

Issues: Read
Pull Requests: Read & Write
Contents: Read
Checks: Read

Webhook ingestion includes:

issues.opened
issue_comment
pull_request.*
check_run
workflow_run

Security measures:

HMAC SHA-256 signature verification
Installation ID–based scoped tokens
Immediate 200 response with async processing

All webhook events are processed asynchronously to avoid blocking and ensure reliability.

2. Deterministic Case State Machine

We implemented a strict state machine with validated transitions.

Illegal transitions are rejected
Every state change is persisted
Each transition includes metadata for replayability
Full audit logging per case

Each case stores:

Issue metadata
Agent outputs
Investigation artifacts
PR links
CI results
Timestamped state transitions

This prevents uncontrolled agent behavior and guarantees workflow consistency.

3. LangGraph-Based Agent Orchestration

We orchestrate the investigation pipeline using LangGraph as a DAG:

START
 → Triage
 → Codebase Search
 → Documentation Analysis
 → Log Analysis
 → Patch Generation
 → Report
 → END

Each node:

Receives structured state
Returns typed outputs
Emits real-time SSE events
Cannot mutate global state directly

This keeps agents deterministic and observable.

4. Codebase Intelligence with Nia

Instead of naive file search, we integrated Nia to:

Index full GitHub repositories
Perform semantic search
Retrieve relevant file snippets with line numbers
Connect Confluence documentation into the same search space

This eliminates context switching between:

Code
Docs
Historical knowledge

It directly addresses one of the biggest debugging bottlenecks: finding the correct file and understanding how components interact.

5. Log Correlation Engine

Buzz integrates:

Sentry API
Vercel runtime logs

We generate structured log queries from triage outputs, then:

Retrieve relevant error events
Analyze suspicious patterns
Build chronological timelines
Surface environment mismatches

From our survey:

44% of engineers said environment/deployment mismatch causes the most debugging time.

Buzz directly automates this step.

6. Patch Generation + Review Loop

The Patch Agent:

Generates minimal code diffs
Adds unit tests
Drafts structured PR descriptions
Creates feature branches via GitPython

We then:

Trigger CI workflows
Run CodeRabbit review automatically
Parse review feedback
Optionally iterate once for safe fixes

Every change is logged and linked.

No silent modifications.

Challenges We Ran Into

Maintaining Slack ↔ GitHub state consistency
Designing safe, validated state transitions
Preventing uncontrolled agent execution
Streaming real-time updates without race conditions
Making observability detailed but not overwhelming

We solved this by enforcing strict state control and structured agent outputs.

Accomplishments We're Proud Of

Built a fully validated deterministic state machine
Implemented bidirectional Slack ↔ GitHub sync
Designed Slack threads as structured investigation timelines
Logged every transition for auditability and replay
Integrated code, docs, logs, and review into a unified workflow

What We Learned

AI automation works best when constrained by explicit state and structure.

Deterministic workflows outperform loosely chained prompts.

Observability builds trust in autonomous systems.

Sponsors & Technologies We Used

Anthropic (Claude API)

We used Claude as our primary reasoning engine for:

Structured triage classification
Code reasoning
Log analysis
Patch generation

We used structured JSON prompts to enforce deterministic outputs.

NVIDIA (Nemotron)

We integrated Nemotron as an alternative LLM provider for experimentation and provider abstraction.

Our architecture supports switching models without changing workflow logic.

Nia (trynia.ai)

We used Nia for:

Repository indexing
Semantic code search
Confluence documentation integration

This significantly improved codebase retrieval accuracy compared to naive embedding search.

Sentry

Used for:

Production log ingestion
Error event querying
Timeline reconstruction

CodeRabbit

Used for:

Automated PR review
Structured feedback parsing
Iterative safe patch refinement

What's Next for Buzz

Since Nia is still growing, we will follow up with Nia closely on how to perfect our produce, such as integrating index for slack
Case memory and learning from past incidents
Recurring bug pattern detection
CI-aware patch validation
Cross-repo investigation intelligence
Expansion beyond Slack into broader engineering workflows

Built With

anthropic-claude
github-api
langgraph
lucide-react
motion
next.js
nia
prisma
python
radix-ui
react
slack
sqlite
tailwind-css
typescript

Submitted to

2026 Startup Week Buildathon

Created by

I worked on the frontend UI, especially the first page. It was my first time working on UI using GitHub and it was challenging at first, but the learning process extremely rewarding!

Noa Yaron
I built the frontend: the cases dashboard, the live investigation UI that streams agent progress in real time, and the intel report. The challenging part was the Slack integration: getting webhook events to reliably create cases and show up on the dashboard for users to investigate.

Coco Liu
I worked on the multi AI agent systems, had a lot of problems on connecting them and creating dependency graph

Alex Jia
I worked on the GitHub App, webhook system, and the case state machine. I had a lot of challenges setting up Slack and GitHub integrations and making all the different tools work together reliably.

eddieliu-dev

Updates

Noa Yaron started this project — Feb 22, 2026 11:58 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.