Inspiration

I kept seeing AI agent demos where a single agent has full access to everything -- GitHub, email, calendar, Slack, all with the same credentials. One bad tool call and it's game over. I wanted to build something where agents are structurally limited, not just politely asked to behave. When I saw Auth0's Token Vault and the "Authorized to Act" hackathon, the idea clicked: what if each agent could only get tokens for the services it actually needs?

What it does

DoughGuard is a multi-agent AI chat app with three agents, each holding different Auth0 Token Vault scopes:

  • Reader gets read-only access to GitHub, Calendar, and Slack. It can look at things but can't change anything.
  • Builder gets read+write access to GitHub only. It can create repos and file issues, but has zero access to Calendar or Slack.
  • Coordinator gets read+write access to Calendar and Slack only. It can schedule meetings and send messages, but can't touch code.
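The agent split above can be sketched as a plain scope map. This is an illustrative sketch, not the actual DoughGuard source -- the names and shape of the config are assumptions; the point is that anything absent from an agent's entry is structurally unreachable:

```typescript
// Hypothetical per-agent scope map (illustrative names, not the real config).
// Each agent lists the services and access levels it may request from
// Token Vault; a missing entry means the agent simply has no path to it.
type Access = "read" | "write";

const AGENT_SCOPES: Record<string, Record<string, Access[]>> = {
  reader: {
    github: ["read"],
    calendar: ["read"],
    slack: ["read"],
  },
  builder: {
    github: ["read", "write"], // repos and issues only
  },
  coordinator: {
    calendar: ["read", "write"],
    slack: ["read", "write"],
  },
};

// A tool is only registered for an agent if this check passes.
function canUse(agent: string, service: string, access: Access): boolean {
  return AGENT_SCOPES[agent]?.[service]?.includes(access) ?? false;
}

console.log(canUse("builder", "slack", "write")); // → false
```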

Bonus Blog Post: How Token Vault Solves the Confused Deputy Problem for AI Agents

This section is our entry for the Bonus Blog Post Prize (250+ words). It covers an architectural insight about Auth0 Token Vault that is materially different from the project description above.

The Real Problem Isn't Authentication -- It's Authorization Scope

When we started building DoughGuard, we assumed the hard part would be getting OAuth tokens from multiple providers. It wasn't. Auth0 Connected Accounts made that straightforward. The actual challenge was preventing a multi-agent system from collapsing into a single-privilege-level blob where every agent can do everything.

This is a classic confused deputy problem. In traditional software, a confused deputy is a program that gets tricked into misusing its authority -- it has permissions the caller shouldn't have, and the caller exploits that gap. In AI agent systems, the trick is even easier: the LLM hallucinates a tool call, or a prompt injection smuggles in an unauthorized action. If the agent holds write tokens for every service, the damage is immediate and silent.

Token Vault as an Architectural Boundary

Auth0 Token Vault gave us something we didn't expect: a natural enforcement layer for agent-level scope isolation. Each withTokenVault() wrapper from @auth0/ai-vercel binds a tool to a specific connection and scope set. When our Builder agent calls getAccessTokenFromTokenVault(), it can only receive a GitHub token with repo scope. It physically cannot obtain a Slack chat:write token, because that connection is not in its wrapper configuration.

This turns Token Vault from a credential storage mechanism into an authorization boundary. The scoping happens at the federated token exchange layer, before the tool function even executes. The agent's tool registry is shaped by its Token Vault connections -- tools that require connections the agent doesn't have simply don't get registered.
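The registry shaping described above can be modeled with a simple filter. The tool names and descriptor shape below are hypothetical -- in the real app each tool is wrapped with withTokenVault() from @auth0/ai-vercel; this sketch only models the "don't register unauthorized tools" step:

```typescript
// Hypothetical tool descriptors: each declares the connection it depends on.
// (The real tools are withTokenVault()-wrapped; here we model only the
// registration filter that shapes each agent's tool registry.)
interface ToolSpec {
  name: string;
  connection: "github" | "google-calendar" | "slack";
}

const ALL_TOOLS: ToolSpec[] = [
  { name: "createRepo", connection: "github" },
  { name: "fileIssue", connection: "github" },
  { name: "scheduleMeeting", connection: "google-calendar" },
  { name: "sendSlackMessage", connection: "slack" },
];

// An agent's registry contains only tools whose connection it holds.
// Unregistered tools are invisible to the LLM, so it cannot call them.
function buildRegistry(connections: string[]): ToolSpec[] {
  return ALL_TOOLS.filter((t) => connections.includes(t.connection));
}

const builderTools = buildRegistry(["github"]); // sendSlackMessage is absent
```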

What We Learned

The key insight: in multi-agent systems, authorization should be structural, not behavioral. Don't tell the agent "you're not allowed to send Slack messages" in a system prompt and hope the LLM obeys. Instead, don't give it the tool. Auth0 Token Vault's connection-scoped token exchange makes this trivial -- each agent gets a different withTokenVault() configuration, and the enforcement happens at the infrastructure level, not the prompt level.

This pattern -- splitting Token Vault connections across agents by role -- is something we'd use in any production multi-agent system. It's the difference between "the agent promised not to" and "the agent can't."

When you send a message, a two-stage router (keyword matching first, then Gemini 2.0 Flash / Groq LLM fallback) figures out which agent handles it. Each agent only has tools for its authorized services. A real-time audit trail logs every routing decision, scope check, and token exchange as it happens. There's also a permission matrix that visually shows the access grid across all agents and services.

How we built it

The core stack is Next.js 15 with the Vercel AI SDK v6. Auth is handled by @auth0/nextjs-auth0 v4 with Universal Login and the Connected Accounts endpoint enabled. Each tool is wrapped with withTokenVault() from the @auth0/ai-vercel SDK, which handles the federated token exchange at runtime.

The routing system has two layers: a deterministic keyword classifier that catches obvious intents ("create a repo" always routes to Builder), and an LLM classifier (Gemini 2.0 Flash primary, Groq fallback) for ambiguous requests. If nothing matches confidently, it defaults to Reader -- the safest option, since Reader is read-only.
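The two-layer router can be sketched as follows. The keyword table is illustrative, and the LLM classifier is stubbed as a synchronous callback to keep the sketch self-contained (the real second stage is an async Gemini/Groq call):

```typescript
// Sketch of the two-stage router: deterministic keywords first, classifier
// fallback second. Patterns and agent names are illustrative.
type Agent = "reader" | "builder" | "coordinator";

const KEYWORD_ROUTES: Array<[RegExp, Agent]> = [
  [/\b(create|new)\s+(a\s+)?repo\b/i, "builder"],
  [/\bfile\s+an?\s+issue\b/i, "builder"],
  [/\b(schedule|meeting|calendar)\b/i, "coordinator"],
  [/\bsend\b.*\bslack\b/i, "coordinator"],
];

function routeMessage(
  message: string,
  llmClassify: (m: string) => Agent | null, // real app: async LLM call
): Agent {
  // Stage 1: deterministic match for obvious intents.
  for (const [pattern, agent] of KEYWORD_ROUTES) {
    if (pattern.test(message)) return agent;
  }
  // Stage 2: LLM classifier; default to the read-only Reader if it is
  // unsure, since a misroute to Reader cannot cause writes.
  return llmClassify(message) ?? "reader";
}
```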

On the frontend, the permission matrix pulls from the same agent config that powers the backend tool registry, so the visual representation always matches the actual enforcement. The audit trail uses client-side event emission synced with server-side tool execution timestamps.

The whole thing runs on free tiers: Auth0 free plan, Groq free inference, Gemini free tier, Vercel hobby. Total infrastructure cost is $0.

Challenges we ran into

Auth0 Token Vault's Connected Accounts flow was the biggest hurdle. The federated token exchange (RFC 8693) requires a specific sequence: the user must authenticate, then explicitly link each service through the Connected Accounts flow before the agent can exchange tokens. Getting the timing right between login, connection, and token exchange took significant debugging.

We also hit issues with the @auth0/ai-vercel SDK: the withTokenVault() wrapper requires async context set up by setAIContext(), which only exists inside the Vercel AI SDK's streamText pipeline. Calling wrapped tools manually (outside streamText) fails silently because the async local storage is never initialized. We ended up with a hybrid approach: pre-execute tools via keyword matching and inject the results into the system prompt, while keeping the wrapped tools available to the streaming pipeline.
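The hybrid workaround can be sketched like this. All names here are hypothetical illustrations of the pattern, not the SDK's API -- the idea is just "run the tool yourself, then hand the model the result as context":

```typescript
// Sketch of the hybrid approach: tools that must run outside streamText
// (where setAIContext()'s async local storage is unavailable) are
// pre-executed directly, and their output is injected into the prompt.
type ToolFn = (input: string) => Promise<string>;

async function buildAugmentedPrompt(
  basePrompt: string,
  message: string,
  tools: Record<string, { pattern: RegExp; run: ToolFn }>,
): Promise<string> {
  for (const [name, tool] of Object.entries(tools)) {
    if (tool.pattern.test(message)) {
      // Pre-execute outside the streaming pipeline.
      const result = await tool.run(message);
      // The model sees the result as context rather than via tool-calling.
      return `${basePrompt}\n\n[Tool ${name} result]\n${result}`;
    }
  }
  return basePrompt; // no match: fall through to the normal pipeline
}
```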

The GitHub OAuth App also needed token expiration explicitly enabled before Auth0 could store refresh tokens -- GitHub issues non-expiring tokens by default, and Token Vault needs refresh tokens for the federated exchange.

Accomplishments that we're proud of

The permission-denial demo is the feature I keep showing people. Lock the Builder agent, ask it to send a Slack message, and watch it refuse -- not because a prompt told it to, but because the sendSlackMessage tool literally doesn't exist in Builder's registry, and Token Vault won't issue a Slack token for that agent's configuration. That's structural enforcement, not behavioral.

The deterministic routing layer is also satisfying. Before adding it, the LLM would occasionally route "create a repo" to the Reader agent. Now keyword matching catches obvious intents first, and the LLM only handles genuinely ambiguous requests. Routing accuracy went from inconsistent to reliable.

Getting the entire stack running on $0 was a nice bonus. Free Auth0 tier, free Gemini and Groq inference, free Vercel hosting. A judge can clone the repo and run it without a credit card.

What we learned

The big lesson: in multi-agent systems, authorization should be structural, not behavioral. Putting "you are not allowed to send Slack messages" in a system prompt and hoping the LLM obeys is fundamentally different from not giving it the tool. Auth0 Token Vault makes structural enforcement practical because each withTokenVault() wrapper binds a tool to specific connections and scopes at the infrastructure level.

We also learned that Token Vault is really a two-step process that's easy to conflate. Step one is authentication (the user logs in). Step two is connection (the user explicitly links a service via Connected Accounts). The federated token exchange only works after both steps are complete. This distinction matters for UX -- you need to guide users through both.

Category-based access rules turned out to be more scalable than per-service rules. When we added Jira to the connection registry, Reader automatically got read access because Jira falls under "project-management." No code changes to the agent configuration. The category system handles the mapping.
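The category mechanism can be sketched in a few lines. The category names come from the write-up ("project-management"); the mapping code and rule shapes are illustrative assumptions:

```typescript
// Sketch of category-based access: services declare a category, and agents
// are granted access per category, so new services inherit existing rules.
const SERVICE_CATEGORIES: Record<string, string> = {
  github: "code",
  jira: "project-management",
  "google-calendar": "scheduling",
  slack: "messaging",
};

const CATEGORY_RULES: Record<string, { read: string[]; write: string[] }> = {
  reader: {
    read: ["code", "project-management", "scheduling", "messaging"],
    write: [],
  },
  builder: { read: ["code"], write: ["code"] },
  coordinator: {
    read: ["scheduling", "messaging"],
    write: ["scheduling", "messaging"],
  },
};

function hasAccess(agent: string, service: string, mode: "read" | "write"): boolean {
  const category = SERVICE_CATEGORIES[service];
  return CATEGORY_RULES[agent]?.[mode].includes(category) ?? false;
}

// Adding Jira needed no agent changes: "project-management" was already
// in Reader's read list, so Reader picked it up automatically.
console.log(hasAccess("reader", "jira", "read")); // → true
```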

What's next for DoughGuard

  • Human-in-the-loop for destructive operations: CIBA (Client-Initiated Backchannel Authentication) is already enabled on the Auth0 tenant. The next step is wiring it so that write operations (like deleting a repo) require explicit user approval via push notification before the agent can proceed.

  • Fine-grained authorization with Auth0 FGA: Moving beyond agent-level scope isolation to document-level and resource-level authorization. For example, Builder could have write access to specific repos, not all repos.

  • More live service integrations: Google Calendar, Slack, and additional providers connected through Token Vault with real OAuth flows.

  • Agent-to-agent delegation: When the Reader gets a write request, instead of just refusing, it could securely delegate to the Builder with an audited handoff, maintaining the permission chain.

Built With

  • ai-vercel
  • auth0-for-ai-agents
  • auth0-token-vault
  • gemini-2.0-flash
  • groq
  • next.js-15
  • nextjs-auth0
  • react-19
  • tailwindcss4
  • typescript
  • vercel-ai-sdk-v6