Bonus Blog Post: The Agentic Identity Crisis (and How Token Vault Solved It)
As developers in 2026, we are facing an "Agentic Identity Crisis."
We have incredible LLMs capable of executing complex, multi-step workflows. But enterprise adoption is stalled by a massive security flaw: to let an AI agent do its job, you traditionally have to hand it long-lived OAuth tokens. I experienced this fear firsthand. I previously built custom VSCode extensions just to back up my files before letting AI coding agents touch my codebase. The idea of an autonomous script holding a permanent credential with repo or gmail.send scopes was a security surface area I simply wasn't willing to risk.
When I entered the Authorized to Act hackathon, I dug into the new Auth0 Token Vault, and it completely inverted my mental model. Token Vault proved that the agent doesn't need to hold the keys; the agent just needs to borrow them for a few milliseconds under strict supervision.
However, implementing a true zero-trust agent brought a massive technical hurdle: The Agentic Latency Trap.
If my LangGraph agent needs permission to merge a Pull Request on GitHub, stopping the execution to redirect the user to a browser window for OAuth consent completely destroys the conversational flow and the LLM's context window. I needed asynchronous, out-of-band authorization.
The solution was combining Token Vault with Auth0 CIBA (Client-Initiated Backchannel Authentication). But wiring this up revealed a deep architectural challenge. When I attempted to nest the Auth0 CIBA and Token Vault decorators on a single LangChain StructuredTool, it crashed the backend. LangChain's internal _parse_input validation threw a ValueError: InjectedToolCallId. The outer CIBA decorator was unintentionally stripping the metadata that the inner Vault decorator required.
Instead of abandoning the architecture, this hurdle forced me to deeply understand the underlying OAuth 2.0 protocols powering Auth0.
Working with the recommended patterns from the Auth0 engineering team, I engineered a cleaner composition. I utilized the auth0-ai-langchain CIBA decorator to natively pause the LangGraph thread and fire the Auth0 Guardian push notification to my phone. Then, inside the tool's execution block, I bypassed the SDK wrapper and manually invoked the Auth0 Token Exchange API (RFC 8693) using the urn:auth0:params:oauth:grant-type:token-exchange:federated-connection-access-token grant type.
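That manual exchange can be sketched roughly as follows. The endpoint and parameter names follow my reading of the Auth0 Token Vault documentation; the helper names are illustrative, and the requested_token_type value should be verified against the current docs before use.

```python
# Sketch of the manual RFC 8693 Token Exchange call described above.
# Parameter names follow the Auth0 Token Vault docs; helper names and the
# requested_token_type value are assumptions to verify against current docs.
import json
import urllib.parse
import urllib.request

FEDERATED_GRANT = (
    "urn:auth0:params:oauth:grant-type:"
    "token-exchange:federated-connection-access-token"
)

def build_token_exchange_payload(client_id, client_secret,
                                 refresh_token, connection):
    """Form-encoded body for POST https://{AUTH0_DOMAIN}/oauth/token."""
    return {
        "grant_type": FEDERATED_GRANT,
        "client_id": client_id,
        "client_secret": client_secret,
        "subject_token": refresh_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:refresh_token",
        "requested_token_type":
            "http://auth0.com/oauth/token-type/federated-connection-access-token",
        "connection": connection,  # e.g. "github" or "google-oauth2"
    }

def exchange_for_provider_token(domain, **kwargs):
    """POST the exchange; returns the short-lived federated provider token."""
    body = urllib.parse.urlencode(build_token_exchange_payload(**kwargs)).encode()
    req = urllib.request.Request(
        f"https://{domain}/oauth/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]
```

The returned federated access token is scoped to one provider connection and expires in seconds, which is what makes the "borrow, don't hold" model work.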
It worked flawlessly. The agent pauses mid-thought. My phone buzzes. I tap "Approve." The graph resumes, the Token Vault dispenses a microscopic access token, the API call executes, and the token vanishes.
This journey taught me that Auth0 Token Vault isn't just a convenient credential store; it is the fundamental identity infrastructure required for the Agentic Web. By packaging my solution into a published PyPI library (delegate-manifest), I hope to show other developers that enterprise-grade, zero-trust AI agents aren't just a theory anymore; with Auth0, they are a reality you can build today.
---------------------------------- DEVPOST Project Story -----------------------------------------
Inspiration
I am afraid of giving AI agents access to my accounts.
Not abstractly afraid. Practically afraid. Before I trust any AI agent with file access, I manually back up my entire working directory, a habit I built after watching an early coding agent silently delete a configuration file I had spent three days writing. The agent thought it was cleaning up. It had no concept of what "important" meant. It had permanent write access, so it wrote. Permanently.
That incident changed how I work. I stopped using IDE-integrated AI agents entirely. Their permission model is binary: the agent either has access to your GitHub, your email, your filesystem, or it doesn't. There is no "access to this repository, for this action, approved by me, right now." I moved to isolated web apps to preserve that control. The productivity cost was real. The peace of mind was worth it.
When Auth0 announced Token Vault, I read the documentation twice.
Token Vault is not a credential manager. It is an ephemeral credential dispenser. The agent does not receive a token and hold it. Token Vault issues a scoped token for one API call, the call executes, the token expires. The agent never possessed the credential in any meaningful sense. And when Auth0 added CIBA (the ability to pause an AI agent mid-execution and push a Guardian notification to my phone for explicit approval), I realized we finally had the infrastructure to solve the root cause.
The root cause is not "AI agents are dangerous." The root cause is "AI agents have been given the wrong type of credential."
Long-lived API keys are the wrong type. OAuth tokens stored in databases are the wrong type. Both assume the agent can be trusted indefinitely with a persistent credential. Auth0 Token Vault is the first infrastructure that assumes the opposite as a default.
Delegate was built to prove that assumption is the right one.
What it does
Delegate is a zero-trust multi-agent system where AI sub-agents act across Gmail, GitHub, and Dropbox, but only through an explicit, revocable Auth0 Token Vault permission contract. The agent never holds an OAuth token. Every credential is dispensed at execution time and expires in seconds.
Three capabilities define the security model:
1. The AgentPermissionManifest β the live permission contract
Every tool the agent can use is declared in a typed manifest. Tools are classified as safe (silent Token Vault fetch, no user interaction) or requires_stepup (Auth0 CIBA push to Guardian, explicit approval before any token is issued). The manifest is live and mutable: any tool can be disabled at runtime without restarting the agent.
2. The Chrome Extension Kill Switch β revocation from any tab
The extension holds a persistent SSE connection to the FastAPI backend. When the agent attempts a dangerous action, the extension auto-opens via chrome.action.openPopup() and displays exactly what the agent wants to do: tool name, OAuth scope, LangGraph node ID, and a live TTL countdown. The user has two choices: "Reject this request" (soft, agent fails gracefully) or "Reject & Disable" (hard, tool removed from manifest permanently, cancellation message injected directly into the LangGraph thread before Auth0 finishes polling).
The extension is not a notification. It is a security control plane available from any browser tab, mid-execution, without navigating to a dashboard.
3. The Forensic Witness Pattern Audit Chain
Every tool execution, whether a silent Token Vault fetch or a CIBA step-up approval, emits a SHA-256 chained receipt to delegate-audit.jsonl. Each receipt incorporates the previous entry's hash, the LangGraph node_id, the decision, and the timestamp. The chain is tamper-evident: changing any receipt breaks every subsequent hash. This proves exactly which agent node consumed which Token Vault credential, in what order, with what authorization decision.
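A minimal sketch of such a hash-chained receipt, assuming illustrative helper names (make_receipt, verify_chain); the field names follow the description above, not the project's actual serialization:

```python
# Sketch of a tamper-evident, SHA-256 hash-chained receipt log.
# Helper names are illustrative; field names follow the text above.
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first receipt

def make_receipt(prev_hash, node_id, tool, decision, timestamp):
    body = {
        "prev_hash": prev_hash,
        "node_id": node_id,
        "tool": tool,
        "decision": decision,
        "timestamp": timestamp,
    }
    # Hash the canonical JSON of the body. Because prev_hash is inside the
    # hashed body, editing any earlier receipt invalidates every later one.
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    return {**body, "hash": digest}

def verify_chain(receipts):
    prev = GENESIS
    for r in receipts:
        body = {k: v for k, v in r.items() if k != "hash"}
        if body["prev_hash"] != prev:
            return False  # chain linkage broken
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if recomputed != r["hash"]:
            return False  # receipt contents were altered
        prev = r["hash"]
    return True
```

Appending each receipt as one line of a .jsonl file gives an append-only log that any reader can re-verify offline.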
[IMAGE PLACEHOLDER: Insert Mermaid PNG of High-Level Architecture Diagram showing FastAPI, LangGraph, and Auth0]
How we built it
Delegate is built on a four-layer Auth0 identity pipeline where each SDK handles a distinct security concern with no overlap:
auth0-fastapi manages the HTTP session layer. User sessions are stored as HttpOnly chunked cookies (4KB chunk splitting handles the payload size that breaks standard session middleware). The Connected Accounts flow (linking Google, GitHub, and Dropbox to the user's Auth0 profile) is handled entirely by the mount_connected_account_routes config, powered by the My Account API with MRRT (Multi-Resource Refresh Tokens).
auth0-server-python manages the Connected Accounts lifecycle. ServerClient.list_connected_accounts() powers the live provider status panel in both the Svelte dashboard and the Chrome extension. For dangerous tools, auth0-server-python executes the RFC 8693 OAuth 2.0 Token Exchange call directly, exchanging the user's Auth0 refresh token for a scoped, ephemeral provider token. This is the call that Token Vault processes. It runs inside the tool function body, not as a decorator; this is a critical architectural decision explained in the Challenges section.
auth0-ai-langchain is the LangGraph integration layer. @with_token_vault silently fetches credentials for safe tools: the agent executes read_drive_doc, Token Vault issues a drive.readonly token scoped to the google_read LangGraph node, the document is read, the token expires. @with_async_authorization handles dangerous tools: it fires a CIBA bc-authorize request to Auth0, raises GraphInterrupt to pause the LangGraph thread with full state preserved, and hands control to GraphResumer. GraphResumer runs as a FastAPI lifespan asyncio task, polling Auth0 every 5 seconds for the user's Guardian approval decision, then automatically resuming the thread when the user approves.
The reasoning engine is Gemini 2.5 Flash via langchain-google-genai. The orchestration layer is LangGraph with a custom ManifestToolNode, a drop-in replacement for LangGraph's ToolNode that enforces the manifest gate before any Auth0 service is contacted. The frontend is Svelte 5 with a store controller pattern receiving real-time updates via Server-Sent Events. The backend is FastAPI with a single asyncio.Queue-based SSE fan-out: one connection, all subscribers, zero polling.
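The asyncio.Queue-based SSE fan-out can be sketched roughly like this; EventBroadcaster and sse_stream are illustrative names, not the project's actual code:

```python
# Sketch of an asyncio.Queue-based SSE fan-out: one publisher, one queue
# per subscriber, no polling. Names are illustrative assumptions.
import asyncio
import json

class EventBroadcaster:
    def __init__(self):
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    async def publish(self, event: dict) -> None:
        # Fan out to every live subscriber (dashboard, extension, ...).
        for q in self._subscribers:
            await q.put(event)

async def sse_stream(broadcaster: EventBroadcaster):
    """Yields SSE-formatted lines, suitable for a streaming HTTP response."""
    q = broadcaster.subscribe()
    try:
        while True:
            event = await q.get()
            yield f"data: {json.dumps(event)}\n\n"
    finally:
        broadcaster.unsubscribe(q)
```

In FastAPI, sse_stream would be wrapped in a StreamingResponse with media type text/event-stream; each consumer blocks on its own queue instead of polling.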
The core permission logic is packaged as delegate-manifest, a standalone Python library published to PyPI:
pip install delegate-manifest
Any LangGraph developer can integrate the full permission layer in one line:
from delegate_manifest.langgraph import ManifestToolNode
graph_builder.add_node("tools", ManifestToolNode(tools=[...], manifest=manifest))
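The manifest object passed above might be shaped like the following sketch. This is an illustrative model, not the published delegate-manifest API; the class names, fields, and methods here are assumptions.

```python
# Illustrative sketch of a typed, live-mutable permission manifest.
# NOT the delegate-manifest API; all names here are assumptions.
from dataclasses import dataclass, field
from enum import Enum

class Policy(Enum):
    SAFE = "safe"                        # silent Token Vault fetch
    REQUIRES_STEPUP = "requires_stepup"  # CIBA push + explicit approval

@dataclass
class ToolGrant:
    scope: str        # OAuth scope the tool is allowed to request
    policy: Policy
    enabled: bool = True

@dataclass
class PermissionManifest:
    grants: dict = field(default_factory=dict)  # tool name -> ToolGrant

    def allow(self, name: str, scope: str, policy: Policy) -> None:
        self.grants[name] = ToolGrant(scope, policy)

    def disable(self, name: str) -> None:
        """Live revocation: takes effect on the agent's next tool call."""
        if name in self.grants:
            self.grants[name].enabled = False

    def check(self, name: str) -> ToolGrant:
        """Gate called before any Auth0 service is contacted."""
        grant = self.grants.get(name)
        if grant is None or not grant.enabled:
            raise PermissionError(f"tool '{name}' is not permitted")
        return grant
```

The key property is that check() runs before any token request, so a disabled tool never reaches Token Vault at all.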
Challenges we ran into
The InjectedToolCallId Collision: A Security Best Practice Discovery
The most significant challenge became our most significant contribution to the Auth0 ecosystem.
The natural architectural pattern for a tool that requires both Token Vault credentials AND CIBA step-up approval is decorator composition:
# Logical. Expected. Crashes.
merge_pr = with_async_authorization(with_token_vault(StructuredTool(...)))
Both with_async_authorization and with_token_vault internally call tool_wrapper() from auth0-ai-langchain, which extends the tool's Pydantic args_schema with InjectedToolCallId via create_model. When stacked, the outer decorator's wrapper invokes the inner wrapped tool with a plain dict, but LangChain 0.3.x _parse_input() requires a full ToolCall object when InjectedToolCallId is present in the schema. The result is a runtime ValueError that surfaces only after Auth0 Guardian has already sent a push notification to the user's phone.
We traced the root cause to tool_wrapper.py line 23: return await tool.ainvoke(input, config) passes input (a plain dict) instead of the full ToolCall object that the inner schema now expects.
We reported this on the Auth0 Discord and the Auth0 Engineering team confirmed our workaround is the recommended production pattern:
"your workaround is indeed the recommended approach: Use the with_async_authorization (CIBA) decorator as the primary wrapper to handle the graph interrupt and human approval step, then manually call the Token Exchange API from within your tool's function body to retrieve the necessary access token from Token Vault."
The architectural result is cleaner than the original pattern. CIBA and Token Vault are now separate, auditable steps. The CIBA decorator proves who approved this action and when. The RFC 8693 Token Exchange call inside the function proves which credential was used and on which node. Both are independently verifiable in the receipt chain. The fix at the SDK level is a one-line change in tool_wrapper.py; we have documented it in ARCHITECTURE.md as a contribution to the Auth0 open-source ecosystem.
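Structurally, the recommended composition looks like this sketch. Here ciba_stepup is a no-op stand-in for auth0-ai-langchain's with_async_authorization (the real decorator pauses the LangGraph thread and awaits Guardian approval), and fetch_vault_token is a hypothetical helper wrapping the manual RFC 8693 call; neither is the SDK's actual code.

```python
# Structural sketch of the recommended composition: the CIBA decorator is
# the ONLY wrapper; the Token Vault exchange happens inside the tool body.
# ciba_stepup stands in for auth0-ai-langchain's with_async_authorization,
# and fetch_vault_token is a hypothetical RFC 8693 helper.
import functools

def ciba_stepup(func):
    # Stand-in: the real decorator fires bc-authorize, raises GraphInterrupt,
    # and only calls func after the user approves on Guardian.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def fetch_vault_token(connection: str, scope: str) -> str:
    # Hypothetical: manual RFC 8693 exchange against /oauth/token.
    return f"ephemeral-token-for-{connection}:{scope}"

@ciba_stepup                      # outer step: proves WHO approved, and WHEN
def merge_pr(pr_number: int) -> str:
    # inner step: proves WHICH credential, fetched only after approval
    token = fetch_vault_token("github", "repo")
    # ... call the GitHub merge API with `token`, then let it expire ...
    return f"merged #{pr_number} with {token}"
```

Because the exchange lives in the function body, a token is only ever minted after the CIBA approval has already succeeded, and each step emits its own receipt.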
Accomplishments that we're proud of
The delegate-manifest library is live on PyPI.
The core permission layer (AgentPermissionManifest, ManifestToolNode, ReceiptChain) is framework-agnostic, Auth0-native, and available to any LangGraph developer today. This was not an afterthought. The library was extracted from the demo and published as a first-class deliverable because the pattern it implements, typed manifest enforcement before Auth0 contact, is immediately reusable across any agentic system using Token Vault.
The CI/CD pipeline ships on every merge.
cloudbuild.yaml + deploy.sh containerize the FastAPI backend and deploy to Google Cloud Run automatically. The Svelte 5 dashboard deploys to Cloudflare Pages: global CDN, auto-HTTPS, ~45 seconds from push to globally available. The Chrome extension is bundled with esbuild and submitted to the Chrome Web Store. Zero manual deployment steps exist in the release process.
The CIBA interrupt preserves 100% of LangGraph thread state.
This is not a trivial implementation detail. When GraphInterrupt fires, the LangGraph DelegateState (containing the full message history, all prior tool results, and the pending tool call) is serialized by the LangGraph Server checkpointer. When GraphResumer resumes the thread after Guardian approval, the agent continues exactly where it paused, with full context. The user's "Merge PR #42" request is fulfilled without the agent re-asking, re-planning, or losing track of what it was doing.
[IMAGE PLACEHOLDER: Insert Mermaid PNG of CIBA Sequence Diagram showing out-of-band approval]
What we learned
We developed production-level expertise in OAuth 2.0 Token Exchange (RFC 8693), the grant type at the heart of Token Vault. Understanding the exact fields (subject_token_type, requested_token_type, the federated connection grant), the Custom API Client requirements, and the difference between the refresh token and the resulting federated access token required reading the Auth0 documentation alongside the SDK source code.
We learned how Auth0 CIBA preserves agentic context in a way that browser-redirect OAuth fundamentally cannot. CIBA is not a better login screen. It is a different authorization primitive, one designed for processes that cannot be interrupted by a browser redirect without losing state. Understanding this distinction changed how we think about identity in AI systems entirely.
We learned how to bind Auth0 federated tokens to specific LangGraph execution nodes via the node_id field in the Witness Pattern receipt chain. Every token that Token Vault issues is associated with the node that requested it, making the audit trail not just a history of actions but a cryptographic proof of which agent component held which credential at which moment.
What's next for Delegate
The delegate-manifest library is production-shipped. The architectural pattern is proven. The next phase is developer evangelism at scale.
The production path for delegate-manifest is clear:
- Dev.to technical blog series: "Why your LangGraph agent should never hold an OAuth token", a step-by-step guide to integrating delegate-manifest and Auth0 Token Vault into any existing LangGraph agent. Written for the developer who is currently using long-lived API keys, knows it is wrong, but has not found the production pattern yet.
- LinkedIn technical posts: documenting the CIBA architectural discovery, showing how asynchronous backchannel authentication makes real-time human-in-the-loop AI authorization possible without destroying the agent's context window. Targeting AI engineers who have encountered the Agentic Latency Trap and assumed it was unsolvable.
- Auth0 Marketplace listing: delegate-manifest as an integration pattern in the Auth0 ecosystem documentation, giving every LangGraph + Auth0 developer a reference implementation they can fork and deploy.
The hard architectural work is complete. Auth0 Token Vault and CIBA made it possible. The library makes it repeatable. The next step is making sure every developer building agentic systems knows it exists.
Built With
- auth0
- chrome
- fastapi
- svelte5