Inspiration

Most AI apps today interface with a single user at a time. Tools and RAG give the agent access to all the context it needs, but access to the agent itself is still fragmented by its 1:1 agent-user model. The agent cannot coordinate across multiple users, so even when we talk about AI-enabled teams, humans still do the work of passing the AI's output around in meetings, shared documents, and private messages.

But what if an agent could meet its users where they are and collaborate with an entire team? That raises a fundamental tension about identity: when an agent acts on behalf of a team, what identity does it act under? Personal access tokens mean one person owns every action the agent takes. Service accounts mean no one does. In high-stakes scenarios where a clear chain of accountability is required, both schemes fall short.

On engineering teams, production incidents remain a coordination and triage bottleneck, and they expose this very gap. When something goes wrong, engineers manually dig through Git history until one of them forms a hypothesis, opens a revert PR, awaits a review, merges, and updates the team. Each step requires a human to push it to the next, and any AI assistance is siloed within an individual team member's chat session.

What it does

Department of Incidents is an agentic, human(s)-in-the-loop incident response platform. It investigates alerts autonomously and drives them toward resolution by working across multiple principals: on-call engineers, code owners, PR reviewers, with each action tied to the right human identity via Auth0 Token Vault.

When an Application Performance Monitoring (APM) alert fires, the agent wakes up with no active browser session or human prompting. It investigates the root cause using GitHub tools, spawning subagents to avoid context bloat, with every read operation running under the on-call engineer's identity via Token Vault. Its reasoning, tool calls, and output are streamed live to the dashboard visible to the whole team simultaneously.

Once it identifies the cause, it surfaces a request to open a PR inline in the activity stream. The on-call engineer approves, rejects, or offers feedback to steer the agent in real time. If they approve, the agent pulls in the code owner, who signs off under their own identity via the same Token Vault flow. Final merge approval routes back to the on-call engineer. The agent can then decide whether to post Slack updates, mark the incident closed, publish a post-mortem blog on the website, or continue gathering context if something still looks off.

The authorization model

Identity-chained authorization. During onboarding, engineers authorize Department of Incidents to their GitHub via the Connected Accounts flow, which stores their GitHub tokens with read:user and repo scopes in Token Vault. Their Auth0 refresh token is encrypted and stored in the application database. Each identity is then resolved at runtime, per agentic tool call, via Token Vault's token exchange endpoint at POST /oauth/token. The agent acts under whichever human is the appropriate principal for each action. The activity stream doubles as an audit trail, showing who consented to each action and in what capacity.
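As a rough illustration of that per-call exchange, the sketch below builds the request that would be sent to POST /oauth/token. The grant type and token type strings are my understanding of Auth0's Token Vault documentation and should be checked against the current docs; the ExchangeConfig shape, the function name, and the example domain are hypothetical and not taken from the project.

```typescript
// Sketch only. Parameter values reflect Auth0's Token Vault docs as I
// understand them; verify before relying on them. Names are hypothetical.
interface ExchangeConfig {
  domain: string; // e.g. "example.us.auth0.com" (placeholder)
  clientId: string;
  clientSecret: string;
}

function buildTokenExchangeRequest(
  config: ExchangeConfig,
  auth0RefreshToken: string, // decrypted from the application database
  connection: string = "github",
): { url: string; body: Record<string, string> } {
  const body: Record<string, string> = {
    grant_type:
      "urn:auth0:params:oauth:grant-type:token-exchange:federated-connection-access-token",
    subject_token_type: "urn:ietf:params:oauth:token-type:refresh_token",
    subject_token: auth0RefreshToken,
    requested_token_type:
      "http://auth0.com/oauth/token-type/federated-connection-access-token",
    connection,
    client_id: config.clientId,
    client_secret: config.clientSecret,
  };
  // The caller would POST `body` to `url` and read `access_token` from the response.
  return { url: `https://${config.domain}/oauth/token`, body };
}
```

Keeping the request builder separate from the network call makes it easy to check which engineer's refresh token is being exchanged before anything leaves the server.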

Intent-harness separation. The LLM never touches credentials. When the agent needs to act on GitHub, it emits an intent tool call. For example, for a diff fetch, it outputs the commit SHAs. The harness intercepts, fetches a GitHub access token via Token Vault’s RFC 8693 token exchange, executes the API call, and returns only the text result to the model. Sensitive data never enters the context window. This keeps the agent capable while keeping it blind to what it's authorized with, and makes the system auditable at the harness layer independently of whatever the LLM does.
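A minimal sketch of that separation, assuming a single diff-fetch intent. The DiffIntent shape and the resolveToken and fetchDiff helpers are hypothetical stand-ins for the project's real tool schema, token exchange, and GitHub client; they are injected so the boundary is visible.

```typescript
// The only thing the model emits: a credential-free description of intent.
type DiffIntent = { tool: "fetch_diff"; baseSha: string; headSha: string };

interface HarnessDeps {
  // Hypothetical: resolves a GitHub token for a principal via token exchange.
  resolveToken: (principalId: string) => Promise<string>;
  // Hypothetical: performs the GitHub call and returns plain text.
  fetchDiff: (token: string, baseSha: string, headSha: string) => Promise<string>;
}

async function runIntent(
  intent: DiffIntent,
  principalId: string,
  deps: HarnessDeps,
): Promise<string> {
  // The token is created and consumed inside the harness.
  const token = await deps.resolveToken(principalId);
  const text = await deps.fetchDiff(token, intent.baseSha, intent.headSha);
  // Only the text result is returned, so only text can reach the context window.
  return text;
}
```

Because the principal is an argument to the harness rather than part of the intent, the model cannot choose whose identity a call runs under.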

How I built it

Engineers register services on the dashboard and get a unique URL plus a secret (shown once). Incidents are triggered via HMAC-verified webhooks from external observability tools, which begin the agentic loop. Each agent turn and tool call executes as a step in a durable workflow using the Workflow Development Kit (WDK). Steps are atomic and retried on failure, which protects them from transient errors. High-stakes actions suspend execution via a WDK hook: the workflow awaits the hook, pauses without losing state, and resumes on approval. The app uses Vercel Fluid Compute to run on serverless infrastructure and does not consume CPU hours while the workflow is suspended. If no approval arrives, a configurable timeout lets the agent proceed regardless.
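The HMAC check on incoming webhooks could look roughly like the following. This is a sketch under assumptions the write-up does not state: that the sender signs the raw request body with HMAC-SHA256 and transmits the signature hex-encoded. The function names are illustrative.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: HMAC-SHA256 over the raw body, hex-encoded.
function signPayload(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

function verifyWebhook(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so reject those up front.
  if (expected.length !== received.length) return false;
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(expected, received);
}
```

Verification has to run against the raw body bytes as received; re-serializing parsed JSON can change whitespace or key order and invalidate an otherwise correct signature.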

The harness is built on top of the Vercel AI SDK, and the agent is powered by an LLM (through OpenRouter) and a range of tools and subagents, all configurable. The realtime architecture comprises two separate channels. Discrete events such as tool calls are broadcast using Pusher. Streamed output needed a second channel, because Pusher has a fire-and-forget model and clients joining mid-stream would miss earlier chunks. This pointed to something like Kafka, where the producer writes to a log that consumers read from. As it turns out, WDK's writable streams solve this more cleanly on a serverless setup with Server-Sent Events (SSE): clients pass a startIndex parameter indicating how much they've already seen and track their own position from there. A client joining mid-stream gets the full history of completed turns plus live chunks from exactly where the agent is, in correct order, with nothing missed.
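The startIndex idea can be illustrated with a small in-memory log. This is not WDK's API, only a sketch of the replay-then-follow behavior; ChunkLog and its methods are hypothetical names.

```typescript
type ChunkListener = (chunk: string, index: number) => void;

// In-memory stand-in for an append-only stream that readers can resume.
class ChunkLog {
  private chunks: string[] = [];
  private listeners: ChunkListener[] = [];

  append(chunk: string): void {
    const index = this.chunks.length;
    this.chunks.push(chunk);
    this.listeners.forEach((listener) => listener(chunk, index));
  }

  // Replays everything from startIndex, then keeps delivering live chunks.
  // Returns a function that stops delivery.
  subscribe(startIndex: number, onChunk: ChunkListener): () => void {
    for (let i = startIndex; i < this.chunks.length; i++) {
      onChunk(this.chunks[i], i);
    }
    this.listeners.push(onChunk);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== onChunk);
    };
  }
}
```

A client that reconnects passes the index after the last chunk it saw, so it neither misses nor repeats output. In this sketch replay and registration happen synchronously, which is what keeps the ordering gap-free; a networked version would need the same guarantee from the stream itself.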

For remediation, the agent can programmatically open revert pull requests without a local git binary. It performs a tree-level revert using the GitHub Data API: fetching parent trees in parallel, constructing a reverse diff map, processing commits newest-first so older state wins on conflicts, then creating a new tree, committing against HEAD, and opening the PR.
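The reverse diff map at the center of that revert can be sketched as a pure function. The data shapes below are hypothetical: each commit is assumed to carry the paths it touched together with the blob SHA each path had in that commit's parent tree, which the real system would derive from the GitHub Data API.

```typescript
// Hypothetical shapes, not the project's actual types.
interface TouchedFile {
  path: string;
  parentBlobSha: string | null; // null: the path did not exist in the parent tree
}
interface CommitToRevert {
  sha: string;
  files: TouchedFile[];
}

// Returns path -> blob SHA to restore; null means the path should be deleted.
function buildReverseDiffMap(
  commitsNewestFirst: CommitToRevert[],
): Record<string, string | null> {
  const restore: Record<string, string | null> = {};
  for (const commit of commitsNewestFirst) {
    for (const file of commit.files) {
      // Later iterations are older commits, so their parent state overwrites
      // newer entries: the oldest pre-change state wins.
      restore[file.path] = file.parentBlobSha;
    }
  }
  return restore;
}
```

Each entry would then become one item in the new tree. As far as I know, GitHub's create-tree endpoint treats a null sha as a deletion of that path, which is why null is used here for files the reverted commits introduced.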

The agent’s output is persisted to a Neon PostgreSQL database, forming a complete, replayable activity stream. On the frontend, all incoming data is validated by Zod, and then a custom generative UI layer transforms it into a mix of React client and server components, rendering personalized views for each event or user. For example, if the agent needs to use someone’s identity but their token was revoked, the target user sees an inline reauthorization request, while everyone else sees Waiting for @${engineerHandle} to re-authorize. If the agent decides at the end to generate a post-mortem, the blog post is made available on the website under a Next.js dynamic route.
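The per-viewer branching in the reauthorization example can be sketched as a plain function, independent of React and Zod. The event and view shapes are hypothetical; only the status text comes from the description above.

```typescript
// Hypothetical event and view shapes for illustration.
interface ReauthEvent {
  type: "reauth_required";
  engineerId: string;
  engineerHandle: string;
}

type View =
  | { kind: "reauth_prompt" } // rendered as the inline reauthorization request
  | { kind: "status"; text: string };

function viewFor(event: ReauthEvent, viewerId: string): View {
  // The engineer whose token was revoked gets the actionable prompt.
  if (viewerId === event.engineerId) {
    return { kind: "reauth_prompt" };
  }
  // Everyone else only sees who the incident is waiting on.
  return {
    kind: "status",
    text: `Waiting for @${event.engineerHandle} to re-authorize`,
  };
}
```

Deciding the view from the stored event plus the viewer's identity means the same persisted activity stream can be replayed for any team member without storing per-user copies.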

Challenges

Building a clear security model required making product decisions. For example: should users be able to change the on-call engineer, the person whose identity is used for read operations, mid-incident? Architecturally it's a feature toggle, but changing ownership mid-investigation splits the audit trail. Who investigated under which identity, and why did that change? I decided against it. The on-call engineer is fixed at registration, when accountability is established. The next incident assigns ownership to whoever is on-call then.

Secondly, every operation in the agentic loop had to be decomposed into a "use step" function, including Pusher event emissions, to ensure persistence and delivery in the same atomic unit. Side effects not wrapped in their own step can fire more than once, so deciding which effects are safe to repeat and which need their own step required rethinking the loop partway through. The "use step" boundary is the main correctness guarantee the system relies on.

Accomplishments

Department of Incidents proposes a 1:N model for AI agents that opens up new avenues for human and AI collaboration. In the last few years, AI has been a multiplying force in various industries, but its usual chat-based interfaces have limited its potential, and in high-stakes scenarios, questions about ownership over agents’ actions have throttled adoption. Token Vault’s primitives enable agentic coordination without impersonation, in a form that can be generalized to build human(s)-in-the-loop systems for domains beyond software engineering. The architecture of Department of Incidents works for any multi-stakeholder, agentic, asynchronous workflow, and incident response is just its first form.

What I learned

When I started working on this project, I wondered whether an agent's capabilities and its users' oversight could scale together instead of trading off against each other. For Department of Incidents, the question became: can the agent be the always-on teammate without doing things its human teammates don't consent to?

Approval is what confers identity.

The agent nominates an engineer when it calls the merge_pr tool, but the GitHub action executes under whoever actually approves. Although the acting engineer’s GitHub username is known at invocation, their token is resolved only after the hook resumes. Approval is what triggers the token exchange, so the authority to act as that person only materializes at that moment. The agent's ability to act under real human identities and the team's ability to oversee those actions turned out to be the same mechanism. Every GitHub action has a human author, and that author is the person who said yes.

Token timing is a structural problem in durable workflows.

When a protected tool is called, the workflow suspends until a human approves, potentially for days or weeks. A GitHub access token fetched before that pause will be expired by the time the workflow resumes. The fix is to fetch tokens inside the execution step, after approval is granted, not before the suspension. Durable workflow frameworks don't surface this as a constraint; you discover it when an approved action starts failing for no obvious reason.
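The ordering can be made concrete with a small sketch. It does not use WDK's API: waitForApproval stands in for the suspended hook, and fetchGitHubToken and mergePr are hypothetical helpers injected for illustration.

```typescript
interface Approval {
  approverId: string;
}

async function mergeAfterApproval(
  waitForApproval: () => Promise<Approval>, // stands in for the suspended hook
  fetchGitHubToken: (userId: string) => Promise<string>, // hypothetical token exchange
  mergePr: (token: string) => Promise<void>, // hypothetical GitHub call
): Promise<string> {
  // The workflow may sit here for days; no token exists yet.
  const approval = await waitForApproval();
  // The token is fetched only now, for whoever actually approved, so it is fresh.
  const token = await fetchGitHubToken(approval.approverId);
  await mergePr(token);
  return approval.approverId;
}
```

Fetching after the pause addresses both points at once: the token cannot have expired during the wait, and it belongs to the person who approved rather than the person the agent nominated.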

What's next

On engineering teams, context about code can be scattered outside the code itself: across internal documents, in private messages, and on observability tools. Adding tools and integrations that allow the agent to gather context from more sources, or leveraging the Agent-to-Agent (A2A) protocol, would significantly improve its triage confidence. A Slack-based chat input is also worth exploring, so that engineers could offer context to the live incident agent to facilitate triaging, or talk to a higher-level agent that holds context spanning all incidents and can answer questions or unearth patterns from past incidents. Automatically setting up GitHub webhooks would also be a step forward, making the agent reactive to PR events on GitHub directly.

Bonus Blog Post

Does Your Agent Want to See Other People?

My hackathon project began more with constraints than with a feature set. I wanted to build an incident response platform with a background agent that worked autonomously but also kept multiple humans in the loop. Talking to several people is nothing the harness design can't solve, but what do you do when your agent wants not just to talk to them but to act as them as well?

If your agent is calling external APIs, this becomes a security and isolation problem that's more error-prone than it looks. You need to handle token expiry, rotation, and MFA policies, and you quickly see how much surface area there is for errors and security gaps to sneak through.

I didn't have to, though. Auth0's guides and its built-in connector made understanding and implementing GitHub authorization easy, and with an ergonomic SDK and flexible primitives, Token Vault did what it was supposed to: take the hard part off my plate.

I configured my Auth0 application, GitHub app integration, and My Account API, and then hit what I thought was a dead end: the SDK would only exchange the currently active user's token, which wouldn't work for a background agent. The docs explained the underlying API call the SDK was wrapping, and I realized I could do the same thing with a stored Auth0 token.

I spent the rest of my hackathon hours on Department of Incidents’ realtime architecture and the generative UI engine. Token Vault's API was flexible enough to handle runtime federated access token acquisition scoped per action, per user, without much ceremony. The agent's commits, approvals, and merges go out under the approving user's GitHub identity because their federated access token is what gets used.

My agent sees other people, but Token Vault made that the easy part.
