Inspiration - PLEASE TURN ON CAPTIONS ON YOUTUBE
Debugging is still one of the most manual and unstructured parts of software engineering.
While AI tools can autocomplete code, incident resolution still means:
- Searching logs in Sentry or Vercel
- Grepping through large repositories
- Reading Confluence documentation
- Coordinating in Slack threads
- Manually tracking issue and PR state
We validated this with a survey of 10 engineers across startups and big tech:
- 50% spend 30 minutes to 2 hours on a typical bug
- Nearly 50% spend 2+ hours on larger issues
- 44% identified environment or deployment mismatch as the biggest source of debugging time
That translates to roughly 20–30 hours per engineer per month spent debugging — or $4,000–$5,000 per engineer per month at typical startup salary costs.
What stood out wasn’t that bugs are difficult — it’s that they’re fragmented.
Engineers lose time stitching together context from:
- Code
- Logs
- Documentation
- Deployment environments
- Slack conversations
We realized debugging isn’t primarily an intelligence problem — it’s a context aggregation and coordination problem.
So we set out to transform incident resolution from a chaotic Slack thread into a deterministic, observable workflow with explicit state transitions and full auditability.
What It Does
Buzz is an AI incident engineer that lives in Slack and synchronizes with GitHub.
When you mention @Buzz in Slack or open a GitHub issue, it:
- Creates a structured Case
- Initializes a deterministic lifecycle state
- Syncs with GitHub issue metadata
- Streams investigation events live in a Slack thread
- Logs every transition and agent output
Each bug follows a transparent lifecycle:
NEW → TRIAGED → INVESTIGATING → REPORT_READY → PATCHING → PR_OPENED → REVIEW → RESOLVED
Instead of a black-box AI generating a PR, Buzz exposes:
- Files inspected
- Logs queried
- Documentation referenced
- Confidence levels
- Patch reasoning
- CI results
Slack becomes a real-time UI over a controlled backend workflow engine.
How We Built It
We built Buzz as a structured, backend-first system rather than a loose autonomous agent.
1. GitHub App + Verified Webhooks
We implemented a GitHub App with scoped permissions:
- Issues: Read
- Pull Requests: Read & Write
- Contents: Read
- Checks: Read
Webhook ingestion includes:
- issues.opened
- issue_comment
- pull_request.*
- check_run
- workflow_run
Security measures:
- HMAC SHA-256 signature verification
- Installation ID–based scoped tokens
- Immediate 200 response with async processing
All webhook events are processed asynchronously to avoid blocking and ensure reliability.
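A minimal sketch of the signature check, assuming the raw request body and headers are available to the handler. GitHub signs each delivery with HMAC SHA-256 over the raw body and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The function name and the way the secret is passed in are hypothetical, chosen for illustration rather than taken from Buzz's code.

```python
import hashlib
import hmac
from typing import Optional


def verify_github_signature(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Return True if the header matches the HMAC SHA-256 of the raw body.

    `secret` is the webhook secret configured on the GitHub App (assumed to be
    loaded from configuration elsewhere).
    """
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest is a constant-time comparison, which avoids timing leaks.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The check has to run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change the digest.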
2. Deterministic Case State Machine
We implemented a strict state machine with validated transitions.
- Illegal transitions are rejected
- Every state change is persisted
- Each transition includes metadata for replayability
- Full audit logging per case
Each case stores:
- Issue metadata
- Agent outputs
- Investigation artifacts
- PR links
- CI results
- Timestamped state transitions
This prevents uncontrolled agent behavior and guarantees workflow consistency.
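The idea can be sketched as a small transition table plus a guard. The state names come from the lifecycle above; the exact set of allowed transitions (a linear path, plus REVIEW back to PATCHING for one more iteration), the class names, and the in-memory history list standing in for persistence are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class CaseState(Enum):
    NEW = "NEW"
    TRIAGED = "TRIAGED"
    INVESTIGATING = "INVESTIGATING"
    REPORT_READY = "REPORT_READY"
    PATCHING = "PATCHING"
    PR_OPENED = "PR_OPENED"
    REVIEW = "REVIEW"
    RESOLVED = "RESOLVED"


# Assumed transition table: each state maps to the states it may move to.
ALLOWED = {
    CaseState.NEW: {CaseState.TRIAGED},
    CaseState.TRIAGED: {CaseState.INVESTIGATING},
    CaseState.INVESTIGATING: {CaseState.REPORT_READY},
    CaseState.REPORT_READY: {CaseState.PATCHING},
    CaseState.PATCHING: {CaseState.PR_OPENED},
    CaseState.PR_OPENED: {CaseState.REVIEW},
    CaseState.REVIEW: {CaseState.RESOLVED, CaseState.PATCHING},
    CaseState.RESOLVED: set(),
}


class IllegalTransition(Exception):
    """Raised when a requested state change is not in the transition table."""


@dataclass
class Case:
    case_id: str
    state: CaseState = CaseState.NEW
    history: list = field(default_factory=list)  # stand-in for a persisted audit log

    def transition(self, target: CaseState, **metadata) -> None:
        if target not in ALLOWED[self.state]:
            raise IllegalTransition(f"{self.state.value} -> {target.value}")
        # Record the change before applying it so every transition is auditable.
        self.history.append({
            "from": self.state.value,
            "to": target.value,
            "at": datetime.now(timezone.utc).isoformat(),
            "metadata": metadata,
        })
        self.state = target
```

Because a rejected transition raises before anything is recorded, the case state and its history cannot drift apart.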
3. LangGraph-Based Agent Orchestration
We orchestrate the investigation pipeline using LangGraph as a DAG:
START
→ Triage
→ Codebase Search
→ Documentation Analysis
→ Log Analysis
→ Patch Generation
→ Report
→ END
Each node:
- Receives structured state
- Returns typed outputs
- Emits real-time SSE events
- Cannot mutate global state directly
This keeps agents deterministic and observable.
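The node contract can be illustrated without LangGraph itself. The sketch below is a plain-Python stand-in, not LangGraph's API: each node receives a copy of the state and returns only the keys it wants to add, and a runner merges the result and emits an event. The node bodies are placeholders, and every function and key name here is hypothetical.

```python
import copy
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Node = Callable[[State], State]


def triage(state: State) -> State:
    # Placeholder heuristic; a real node would call the LLM.
    title = state["issue_title"].lower()
    return {"severity": "high" if "crash" in title else "low"}


def codebase_search(state: State) -> State:
    return {"files": []}  # placeholder: no real search is performed


def documentation_analysis(state: State) -> State:
    return {"docs": []}


def log_analysis(state: State) -> State:
    return {"log_events": []}


def patch_generation(state: State) -> State:
    return {"patch": None}


def report(state: State) -> State:
    return {"report": f"severity={state['severity']}, files={len(state['files'])}"}


PIPELINE: List[Tuple[str, Node]] = [
    ("triage", triage),
    ("codebase_search", codebase_search),
    ("documentation_analysis", documentation_analysis),
    ("log_analysis", log_analysis),
    ("patch_generation", patch_generation),
    ("report", report),
]


def run_pipeline(initial: State, emit: Callable[[str, State], None]) -> State:
    state = copy.deepcopy(initial)
    for name, node in PIPELINE:
        # Each node sees a deep copy, so it cannot mutate the shared state.
        update = node(copy.deepcopy(state))
        emit(name, update)  # stand-in for streaming an SSE event
        state = {**state, **update}
    return state
```

Usage: `run_pipeline({"issue_title": "App crash on login"}, print)` walks the six nodes in order and prints each node's name with the update it returned.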
4. Codebase Intelligence with Nia
Instead of naive file search, we integrated Nia to:
- Index full GitHub repositories
- Perform semantic search
- Retrieve relevant file snippets with line numbers
- Connect Confluence documentation into the same search space
This eliminates context switching between:
- Code
- Docs
- Historical knowledge
It directly addresses one of the biggest debugging bottlenecks: finding the correct file and understanding how components interact.
5. Log Correlation Engine
Buzz integrates:
- Sentry API
- Vercel runtime logs
We generate structured log queries from triage outputs, then:
- Retrieve relevant error events
- Analyze suspicious patterns
- Build chronological timelines
- Surface environment mismatches
From our survey:
44% of engineers said environment/deployment mismatch causes the most debugging time.
Buzz directly automates this step.
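A rough sketch of the timeline and mismatch steps, assuming log events have already been fetched and normalized into dictionaries with `timestamp`, `environment`, and `message` keys. That shape is hypothetical and is not the Sentry or Vercel response format. The mismatch rule is a simple heuristic chosen for illustration: flag any error message that appears in some observed environments but not in others.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List


def build_timeline(events: List[dict]) -> List[dict]:
    """Return events ordered by their ISO 8601 timestamp."""
    return sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))


def environment_mismatches(events: List[dict]) -> Dict[str, List[str]]:
    """Map each error message to the observed environments where it never occurred."""
    all_envs = {e["environment"] for e in events}
    envs_by_message = defaultdict(set)
    for event in events:
        envs_by_message[event["message"]].add(event["environment"])
    # Keep only messages that are missing from at least one environment.
    return {
        message: sorted(all_envs - envs)
        for message, envs in envs_by_message.items()
        if envs != all_envs
    }
```

An error that shows up only in production, for example, is reported with the other environments listed as missing, which points the investigation at configuration or deployment differences first.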
6. Patch Generation + Review Loop
The Patch Agent:
- Generates minimal code diffs
- Adds unit tests
- Drafts structured PR descriptions
- Creates feature branches via GitPython
We then:
- Trigger CI workflows
- Run CodeRabbit review automatically
- Parse review feedback
- Optionally iterate once for safe fixes
Every change is logged and linked.
No silent modifications.
Challenges We Ran Into
- Maintaining Slack ↔ GitHub state consistency
- Designing safe, validated state transitions
- Preventing uncontrolled agent execution
- Streaming real-time updates without race conditions
- Making observability detailed but not overwhelming
We solved these by enforcing strict state control and structured agent outputs.
Accomplishments We're Proud Of
- Built a fully validated deterministic state machine
- Implemented bidirectional Slack ↔ GitHub sync
- Designed Slack threads as structured investigation timelines
- Logged every transition for auditability and replay
- Integrated code, docs, logs, and review into a unified workflow
What We Learned
AI automation works best when constrained by explicit state and structure.
Deterministic workflows outperform loosely chained prompts.
Observability builds trust in autonomous systems.
Sponsors & Technologies We Used
Anthropic (Claude API)
We used Claude as our primary reasoning engine for:
- Structured triage classification
- Code reasoning
- Log analysis
- Patch generation
We used structured JSON prompts to keep outputs in a predictable, machine-parseable shape.
NVIDIA (Nemotron)
We integrated Nemotron as an alternative LLM provider for experimentation and provider abstraction.
Our architecture supports switching models without changing workflow logic.
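One way to get this kind of provider independence is a small interface that the workflow depends on instead of a specific SDK. The sketch below is an assumption about the shape of such an abstraction, not Buzz's actual code: `LLMProvider`, the two stub classes, and `classify_issue` are hypothetical names, and the stubs return canned JSON rather than calling any real API.

```python
import json
from typing import Protocol


class LLMProvider(Protocol):
    """Anything with a complete(prompt) -> str method can serve as a provider."""

    def complete(self, prompt: str) -> str: ...


class StubClaudeProvider:
    # Stub: a real implementation would call the Claude API here.
    def complete(self, prompt: str) -> str:
        return '{"category": "bug", "severity": "high"}'


class StubNemotronProvider:
    # Stub: a real implementation would call a Nemotron endpoint here.
    def complete(self, prompt: str) -> str:
        return '{"category": "bug", "severity": "medium"}'


REQUIRED_KEYS = {"category", "severity"}


def classify_issue(provider: LLMProvider, issue_text: str) -> dict:
    """Ask the provider for a JSON triage result and validate its shape."""
    prompt = (
        "Classify this issue. Respond with JSON containing the keys "
        f"category and severity.\n\n{issue_text}"
    )
    data = json.loads(provider.complete(prompt))
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model output is missing keys: {sorted(missing)}")
    return data
```

Swapping `StubClaudeProvider()` for `StubNemotronProvider()` changes the model behind `classify_issue` without touching the workflow code that calls it.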
Nia (trynia.ai)
We used Nia for:
- Repository indexing
- Semantic code search
- Confluence documentation integration
This significantly improved codebase retrieval accuracy compared to naive embedding search.
Sentry
Used for:
- Production log ingestion
- Error event querying
- Timeline reconstruction
CodeRabbit
Used for:
- Automated PR review
- Structured feedback parsing
- Iterative safe patch refinement
What's Next for Buzz
- Since Nia is still growing, we will work closely with the Nia team on improving our product, for example by adding indexing for Slack
- Case memory and learning from past incidents
- Recurring bug pattern detection
- CI-aware patch validation
- Cross-repo investigation intelligence
- Expansion beyond Slack into broader engineering workflows
Built With
- anthropic-claude
- github-api
- langgraph
- lucide-react
- motion
- next.js
- nia
- prisma
- python
- radix-ui
- react
- slack
- sqlite
- tailwind-css
- typescript
