- Overview Dashboard: Real-time KPIs, OWASP Agentic Top 10 coverage, behavioral anomaly score, and token exchange timeline
- Agent Chat: Security-aware AI agent with live tool calls, FGA authorization, and event sidebar
- Observatory Audit Trail: Full post-authentication event log with risk classification and OWASP mapping
- OWASP Risk Map: All 10 Agentic Application risks (ASI01–ASI10) with detection status
- Permission Landscape: Agent-to-service scope mapping with granular FGA toggle controls
- Token Vault Debugger: Per-connection health scores, configuration checklists, and known error reference
- Credential-Event Correlation: Token exchanges paired with consuming tool calls (Pattern 1)
Inspiration
At RSAC 2026, five major vendors shipped agent identity frameworks. None of them tracked what the agent does after authentication succeeds. The OWASP Top 10 for Agentic Applications catalogs ten risks that emerge in exactly this post-authentication blind spot. We built Agent Observatory to fill that gap.
What it does
Agent Observatory makes every post-authentication AI agent action observable, auditable, and controllable — using Auth0's existing identity primitives. It implements three authorization patterns:
Pattern 1: Credential-Event Correlation — Every token exchange via Token Vault is logged alongside the tool call that consumed it. For any time window, you can answer: "Which tool calls used credentials from Google Calendar in the last hour?"
Pattern 2: Scope-Bound Risk Classification. OAuth scopes encode what an agent can do. We classify each scope into risk tiers and map them to OWASP Agentic categories (ASI01–ASI10). For example, calendar.freebusy (read availability) carries lower risk than chat:write (post messages).
Pattern 3: Interrupt-as-Circuit-Breaker — When a write operation is requested, a server-side circuit breaker blocks execution. The agent must call a step-up authorization tool, present the risk to the user, and wait for explicit confirmation. This is enforced server-side with a 5-minute TTL registry — even a jailbroken system prompt cannot bypass it.
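As a rough illustration of Pattern 3, a server-side approval registry with a 5-minute TTL could be sketched as follows. This is not the project's actual code: StepUpRegistry, guardedWrite, and the session/tool key format are hypothetical names chosen for the example, and the clock is injectable only to make expiry easy to test.

```typescript
// Illustrative sketch only: class and function names are hypothetical.
const APPROVAL_TTL_MS = 5 * 60 * 1000; // the 5-minute TTL described above

class StepUpRegistry {
  // Maps "session:tool" to the time (ms since epoch) at which approval expires.
  private approvals = new Map<string, number>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Invoked by the server-side confirmation handler once the user approves.
  approve(sessionId: string, tool: string): void {
    this.approvals.set(`${sessionId}:${tool}`, this.now() + APPROVAL_TTL_MS);
  }

  // Invoked by every write tool before it touches an external API.
  isApproved(sessionId: string, tool: string): boolean {
    const key = `${sessionId}:${tool}`;
    const expiresAt = this.approvals.get(key);
    if (expiresAt === undefined) return false;
    if (this.now() >= expiresAt) {
      this.approvals.delete(key); // expired approvals are dropped
      return false;
    }
    return true;
  }
}

// Runs a write operation only when a live approval exists; otherwise reports a block.
function guardedWrite<T>(
  registry: StepUpRegistry,
  sessionId: string,
  tool: string,
  run: () => T,
): { blocked: false; result: T } | { blocked: true; reason: string } {
  if (!registry.isApproved(sessionId, tool)) {
    return { blocked: true, reason: `step-up approval required for ${tool}` };
  }
  return { blocked: false, result: run() };
}
```

Because the check lives in the tool's server-side execution path rather than in the prompt, the model's instructions have no way to mark an operation as approved; only the confirmation handler can call approve().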
How we built it
- Auth0 Token Vault: Tools wrapped with withTokenVault() for RFC 8693 token exchange. withInterruptions() handles missing credentials by surfacing Connect dialogs. Management API fallback for providers without refresh tokens.
- Token Vault Debugger: Per-connection health scores, exchange timelines, configuration checklists, and references to known issues (auth0-ai-samples#66, auth0-ai-js#175).
- Observatory Dashboard: Real-time event stream, risk distribution, OWASP coverage map, credential-event correlation, and behavioral anomaly detection (velocity, cross-service escalation, scope escalation, error bursts).
- FGA Authorization Model: In-memory ReBAC using OpenFGA concepts — service-level permissions and granular scope-level toggles.
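Two of the behavioral anomaly checks listed above (velocity and cross-service escalation) can be approximated with a sliding window over recent tool calls. The sketch below is an assumption about how such checks might look; the thresholds, field names, and function names are illustrative and not taken from the project.

```typescript
// Illustrative sketch: thresholds, field names, and function names are assumptions.
interface ToolCallSample {
  timestampMs: number;
  service: string; // e.g. "google-calendar", "github", "slack"
}

// Keeps only calls inside the trailing window that ends at nowMs.
function inWindow(calls: ToolCallSample[], nowMs: number, windowMs: number): ToolCallSample[] {
  return calls.filter((c) => c.timestampMs > nowMs - windowMs && c.timestampMs <= nowMs);
}

// Velocity: more than maxCalls tool calls inside the window.
function velocityAnomaly(
  calls: ToolCallSample[],
  nowMs: number,
  windowMs: number = 60000,
  maxCalls: number = 10,
): boolean {
  return inWindow(calls, nowMs, windowMs).length > maxCalls;
}

// Cross-service escalation: more than maxServices distinct services inside the window.
function crossServiceAnomaly(
  calls: ToolCallSample[],
  nowMs: number,
  windowMs: number = 60000,
  maxServices: number = 2,
): boolean {
  const services = new Set(inWindow(calls, nowMs, windowMs).map((c) => c.service));
  return services.size > maxServices;
}
```

A combined anomaly score could weight these flags together with the scope escalation and error burst signals, but how the project combines them is not specified here.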
Challenges we ran into
- Token Vault configuration complexity: The 10-step setup process with uninformative errors (Federated connection Refresh Token not found) was the #1 pain point. This directly inspired our Token Vault Debugger.
- Silent error swallowing: Auth0 AI SDK (auth0-ai-js#175) discards federated connection errors silently. We built an error capture layer around every tool call.
- Provider-specific token behavior: GitHub OAuth Apps don't issue refresh tokens, and Slack identity linking loses tokens. We built a Management API fallback path.
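The error capture layer mentioned above can be pictured as a thin wrapper that records a failure and then re-throws it, so nothing is swallowed. The following is a minimal sketch under that assumption; withErrorCapture and capturedErrors are hypothetical names, and the wrapped function stands in for any tool implementation.

```typescript
// Illustrative sketch: names are hypothetical, not the project's actual API.
interface CapturedError {
  tool: string;
  message: string;
  at: number; // ms since epoch
}

const capturedErrors: CapturedError[] = [];

// Wraps an async tool function so that every failure is recorded and still propagated.
function withErrorCapture<A extends unknown[], R>(
  tool: string,
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    try {
      return await fn(...args);
    } catch (err) {
      capturedErrors.push({
        tool,
        message: err instanceof Error ? err.message : String(err),
        at: Date.now(),
      });
      throw err; // re-throw so the failure is never silent
    }
  };
}
```

Recording before re-throwing means the dashboard sees the error even if a layer further up the stack later discards it.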
Accomplishments that we're proud of
- Server-side step-up enforcement that a jailbroken system prompt cannot bypass
- Full OWASP Top 10 for Agentic Applications coverage with real-time visualization
- Token Vault Debugger that addresses the #1 developer pain point
- 6 actionable feedback items submitted to Auth0 product team
What we learned
- Post-authentication is where the real security gaps emerge in agentic AI
- OAuth scopes can serve as a natural risk classification layer
- Token Vault's interrupt mechanism generalizes to any authorization boundary
- The gap between "agent authenticated" and "agent acted securely" is where observability matters most
What's next for Agent Observatory
- Deploy FGA model to a real Auth0 FGA instance
- Add CIBA-based step-up authorization (requires Enterprise plan)
- Expand to multi-agent architectures (ASI07 coverage)
- Build a Token Vault diagnostic API (submitted as feedback to Auth0)
Bonus Blog Post
What Happens After Your Agent Authenticates? Building Post-Auth Observability with Auth0 Token Vault
The Gap Nobody Shipped a Fix For
At RSAC 2026, five major vendors — Cisco, CrowdStrike, Microsoft, IBM (partnered with Auth0 and Yubico), and Okta — each shipped agent identity frameworks. VentureBeat's analysis identified a shared blind spot: every framework verified who the agent was, but none tracked what the agent did after authentication succeeded. As the report put it, "nothing in the stack validates what happens after authentication succeeds" (VentureBeat, March 2026).
This gap is not theoretical. The OWASP Top 10 for Agentic Applications, released in December 2025 by over 100 security researchers, catalogues the risks that emerge precisely in this post-authentication space: agents hijacking goals (ASI01), misusing legitimately granted tools (ASI02), and abusing inherited privileges (ASI03) — all scenarios where the agent holds valid credentials but acts outside the user's intent (OWASP GenAI Security Project).
This project set out to explore a concrete question: can Auth0's existing identity primitives — Token Vault, fine-grained authorization, and interrupt-driven consent — be composed into a system that doesn't just authenticate an agent, but makes the agent's post-authentication behavior observable and controllable?
What Token Vault Actually Provides
Auth0's Token Vault is built on the OAuth 2.0 Token Exchange standard defined in RFC 8693. In practice, this means an application exchanges a valid Auth0 token (a refresh token or an access token) for an external provider's access token through a dedicated grant type (urn:auth0:params:oauth:grant-type:token-exchange:federated-connection-access-token). The external provider tokens are stored server-side by Auth0; they never reach the frontend or the AI agent directly. When a tool needs to call an external API, the SDK calls getAccessTokenFromTokenVault() within the tool's execution context, retrieves the scoped credential, and the agent uses it for exactly that operation (Auth0 Token Vault Docs).
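To make the exchange concrete, the request body for such a token exchange might be assembled as in the sketch below. The grant type string is the one quoted above, and subject_token, subject_token_type, and requested_token_type are parameter names defined by RFC 8693. The requested_token_type value and the connection field reflect our reading of Auth0's documentation and should be treated as assumptions to verify; buildFederatedExchangeBody and its input type are hypothetical helpers, not SDK functions.

```typescript
// Sketch only. Auth0-specific field values below are assumptions to verify
// against the current Token Vault documentation before use.
interface ExchangeInput {
  clientId: string;
  clientSecret: string;
  refreshToken: string; // the Auth0 refresh token being exchanged
  connection: string; // e.g. "google-oauth2"
}

function buildFederatedExchangeBody(input: ExchangeInput): Record<string, string> {
  return {
    // Grant type quoted in the text above.
    grant_type:
      "urn:auth0:params:oauth:grant-type:token-exchange:federated-connection-access-token",
    client_id: input.clientId,
    client_secret: input.clientSecret,
    // RFC 8693 parameters: the token presented and its type.
    subject_token: input.refreshToken,
    subject_token_type: "urn:ietf:params:oauth:token-type:refresh_token",
    // Assumed Auth0-specific value identifying the external provider token.
    requested_token_type:
      "http://auth0.com/oauth/token-type/federated-connection-access-token",
    // Assumed field naming the external connection whose token is requested.
    connection: input.connection,
  };
}
```

In practice the SDK performs this exchange on the application's behalf, so application code would not normally build the body by hand; the sketch only shows what travels over the wire.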
This architecture provides clear delegation boundaries by design: the user authenticates with the external provider once, grants specific OAuth scopes, and the agent can only use those scopes through the Token Vault exchange. Auth0 currently supports 30+ pre-built integrations (Google, GitHub, Slack, Salesforce, Figma, Spotify, among others), plus any custom OAuth 2.0 provider (Auth0 Integrations).
The SDK implements an interrupt mechanism: when a tool wrapped with withTokenVault() cannot obtain a credential — because the user hasn't connected the account yet, or because additional scopes are needed — it throws a TokenVaultInterrupt that pauses execution and surfaces a consent UI to the user. This is the "step-up authorization" pattern: the agent operates with minimal permissions until it hits a boundary, then requests exactly the additional permission it needs, with the user's explicit approval.
What We Found When We Looked Closely
Before writing code, we conducted a systematic review of Auth0's AI agent SDK ecosystem — the auth0-ai-js monorepo (8 TypeScript packages), the auth0-ai-python packages (4 packages), the official sample applications, over 40 GitHub issues across Auth0 repositories, the Auth0 Community Forum, and relevant academic literature.
Three findings shaped our architectural decisions:
1. Token Vault setup has no debugging feedback loop.
The most reported pain point across hackathon participants and general developers is the Token Vault configuration process. Issue auth0-samples/auth0-ai-samples#66 documents a developer who followed every setup step correctly — enabling the token exchange grant type, configuring Offline Access, setting the social connection, putting the Google OAuth app in Production mode — and still received the same uninformative error: Federated connection Refresh Token not found. There is no built-in way to check whether Token Vault has actually stored tokens for a given user, inspect the state of stored token sets, or identify which of the roughly ten configuration steps failed. The error surface is a single opaque message regardless of root cause.
2. Federated connection errors are silently discarded.
Issue auth0/auth0-ai-js#175 (open at time of writing) documents that errors from the federated connection flow in TokenVaultAuthorizerBase are caught and silently discarded — no logging, no error propagation, no user notification. When Token Vault fails, developers and users receive no signal at all. In the context of an AI agent executing a multi-step workflow, a silent credential failure can cause the agent to proceed without the data it needed, producing incorrect or incomplete results with no indication of why.
3. The post-authentication gap is real and structural.
Academic work has formalized this problem. South et al.'s position paper at ICML 2025 argues that "authenticated and auditable delegation of authority to AI agents is a critical component of mitigating practical risks," proposing extensions to OAuth 2.0 and OpenID Connect with agent-specific credentials that maintain chains of accountability (arXiv:2501.09674). The "Agentic JWT" protocol (Goswami, 2025) goes further, proposing intent-binding tokens that tie each agent action to verifiable user intent (arXiv:2509.13597). The theoretical infrastructure exists; what's missing is practical tooling that makes post-authentication agent behavior visible.
Our Approach: Agent Observatory
We built Agent Observatory as a layer on top of Auth0's existing primitives — not replacing Token Vault, but instrumenting it. The core design principle: every token exchange, every tool call, and every authorization decision should produce an observable event.
Multi-service Token Vault integration. The application connects to three distinct external API domains through Token Vault — Google Calendar (productivity), GitHub (developer tooling), and Slack (communication) — to demonstrate credential orchestration across services with different OAuth scopes and trust levels.
Post-authentication audit trail. Every getAccessTokenFromTokenVault() call is wrapped in instrumentation that records: which tool requested the credential, what scopes were used, when the token was exchanged, and whether the operation succeeded. This produces a per-session audit log that makes the agent's post-authentication behavior visible to the user in real time.
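The instrumentation described here could take roughly the following shape. This is a sketch under assumptions: the fetchToken callback stands in for the SDK's getAccessTokenFromTokenVault() call (whose exact signature is not assumed), and TokenExchangeEvent and auditedTokenFetch are hypothetical names. Note that the token value itself is never written to the log.

```typescript
// Illustrative sketch: event shape and function names are assumptions.
interface TokenExchangeEvent {
  tool: string; // which tool requested the credential
  connection: string; // which external account was used
  scopes: string[]; // what scopes were used
  exchangedAt: number; // when the exchange was attempted (ms since epoch)
  success: boolean; // whether the operation succeeded
  error?: string;
}

// Wraps a token fetch and appends one audit event per attempt.
async function auditedTokenFetch(
  log: TokenExchangeEvent[],
  meta: { tool: string; connection: string; scopes: string[] },
  fetchToken: () => Promise<string>, // stand-in for the SDK call
): Promise<string> {
  const exchangedAt = Date.now();
  try {
    const token = await fetchToken();
    log.push({ ...meta, exchangedAt, success: true });
    return token; // returned to the tool, never logged
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    log.push({ ...meta, exchangedAt, success: false, error });
    throw err;
  }
}
```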
Token lifecycle visualization. A diagnostic panel shows the state of each connected account's tokens — when they were issued, when they expire, and when refresh cycles occur. This directly addresses the debugging gap: instead of encountering Refresh Token not found with no context, a developer or user can see the token state at each step.
Risk-based step-up authorization. The OWASP Agentic Top 10 provides a concrete risk taxonomy. We map each tool call against relevant risk categories — a cross-service data read touches ASI03 (Identity & Privilege Abuse); a write operation to an external service touches ASI02 (Tool Misuse). Operations that cross a configurable risk threshold trigger a step-up authorization flow via Auth0's interrupt mechanism, requiring the user to explicitly approve before the agent proceeds.
Fine-grained authorization for service access. Following the Auth0 FGA pattern (built on OpenFGA concepts, a CNCF sandbox project), service-level and scope-level access control ensures that the agent can only access services and scopes the specific user has authorized — addressing ASI03 (Identity & Privilege Abuse) at the authorization layer. In production, this pattern would be deployed to a real Auth0 FGA instance for document-level access control.
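A minimal in-memory version of this check, using OpenFGA-style (user, relation, object) tuples, might look like the sketch below. It only matches directly written tuples and does not implement relation rewrites or inheritance, so it is far simpler than a real FGA model. The relation name can_use and the service:/scope: object prefixes are assumptions made for the example.

```typescript
// Illustrative sketch: tuple naming conventions here are assumptions.
interface RelationTuple {
  user: string; // e.g. "user:alice"
  relation: string; // e.g. "can_use"
  object: string; // e.g. "service:slack" or "scope:slack/chat:write"
}

class InMemoryTupleStore {
  private tuples = new Set<string>();

  private encode(t: RelationTuple): string {
    return `${t.user}#${t.relation}@${t.object}`;
  }

  write(t: RelationTuple): void {
    this.tuples.add(this.encode(t));
  }

  revoke(t: RelationTuple): void {
    this.tuples.delete(this.encode(t));
  }

  // Direct lookup only: true when the exact tuple has been written.
  check(t: RelationTuple): boolean {
    return this.tuples.has(this.encode(t));
  }
}

// A scope is usable only when both the service and the specific scope are granted.
function canUseScope(
  store: InMemoryTupleStore,
  userId: string,
  service: string,
  scope: string,
): boolean {
  const user = `user:${userId}`;
  return (
    store.check({ user, relation: "can_use", object: `service:${service}` }) &&
    store.check({ user, relation: "can_use", object: `scope:${service}/${scope}` })
  );
}
```

Requiring both tuples means that switching off a service disables all of its scopes at once, while a scope toggle affects only that scope.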
Patterns We Identified
Three authorization patterns emerged from this work that we believe are relevant beyond our specific implementation:
Pattern 1: Credential-Event Correlation. By logging token exchange events alongside tool execution events, it becomes possible to answer questions like "which tool calls consumed credentials from Service X in the last hour?" This is the minimal post-authentication observability that the RSAC 2026 analysis identified as missing.
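One way to answer that question is to give each token exchange an identifier and have every tool call record the identifier of the exchange it consumed. The sketch below assumes exactly that; the record shapes, the exchangeId field, and toolCallsUsingConnection are hypothetical and may differ from how the project links the two event streams.

```typescript
// Illustrative sketch: record shapes and the join key are assumptions.
interface ExchangeRecord {
  exchangeId: string;
  connection: string; // e.g. "google-oauth2"
  at: number; // ms since epoch
}

interface ToolCallRecord {
  tool: string;
  exchangeId: string; // the exchange whose credential this call consumed
  at: number;
}

// Returns tool calls that consumed credentials exchanged for `connection` since `sinceMs`.
function toolCallsUsingConnection(
  connection: string,
  sinceMs: number,
  exchanges: ExchangeRecord[],
  toolCalls: ToolCallRecord[],
): ToolCallRecord[] {
  const matchingIds = new Set(
    exchanges
      .filter((e) => e.connection === connection && e.at >= sinceMs)
      .map((e) => e.exchangeId),
  );
  return toolCalls.filter((c) => matchingIds.has(c.exchangeId));
}
```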
Pattern 2: Scope-Bound Risk Classification. OAuth scopes already encode what an agent can do. By classifying scopes into risk tiers (read-only vs. write vs. administrative), it's possible to compute a per-operation risk score without additional infrastructure. The calendar.freebusy scope (read availability) carries lower risk than gmail.send (send emails on behalf of user). The OWASP Agentic Top 10 categories provide a natural taxonomy for this classification.
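A small lookup table is enough to illustrate the idea. In the sketch below the tier names follow the read/write/administrative split above, while the numeric scores, the admin:org example entry, and the choice to treat unknown scopes as the highest tier are assumptions made for the example, not the project's actual values.

```typescript
// Illustrative sketch: scores, table entries, and the fail-closed default are assumptions.
type RiskTier = "read" | "write" | "admin";

const TIER_SCORE: Record<RiskTier, number> = { read: 1, write: 5, admin: 9 };

const SCOPE_TIERS = new Map<string, RiskTier>([
  ["calendar.freebusy", "read"], // read availability
  ["gmail.send", "write"], // send email on behalf of the user
  ["chat:write", "write"], // post messages
  ["admin:org", "admin"], // illustrative administrative scope
]);

// Unknown scopes are treated as the highest tier so that gaps in the table fail closed.
function classifyScope(scope: string): RiskTier {
  return SCOPE_TIERS.get(scope) ?? "admin";
}

// The risk of an operation is the score of its riskiest scope (0 when no scopes are used).
function operationRisk(scopes: string[]): number {
  return scopes.reduce((max, s) => Math.max(max, TIER_SCORE[classifyScope(s)]), 0);
}
```

Taking the maximum rather than the sum keeps the score bounded and makes a single administrative scope dominate, which matches the intuition that one dangerous permission is enough to warrant a step-up.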
Pattern 3: Interrupt-as-Circuit-Breaker. Auth0's TokenVaultInterrupt mechanism is designed for consent flows, but it generalizes to any authorization boundary. When a risk threshold is exceeded, throwing an interrupt pauses agent execution and surfaces the decision to the user — effectively implementing a circuit breaker pattern at the authorization layer. This converts the post-authentication gap from a silent failure mode into an explicit control point.
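The control flow can be sketched with a custom error type standing in for the SDK's interrupt. RiskInterrupt below is a hypothetical class written for this example, not Auth0's TokenVaultInterrupt, and riskGate and executeTool are likewise illustrative names; the point is only that exceeding the threshold raises an exception that the agent loop converts into a paused state and a user prompt.

```typescript
// Illustrative sketch: RiskInterrupt is a stand-in, not the SDK's interrupt class.
class RiskInterrupt extends Error {
  readonly tool: string;
  readonly riskScore: number;

  constructor(tool: string, riskScore: number) {
    super(`Approval required: ${tool} has risk score ${riskScore}`);
    Object.setPrototypeOf(this, RiskInterrupt.prototype); // keeps instanceof reliable on older targets
    this.name = "RiskInterrupt";
    this.tool = tool;
    this.riskScore = riskScore;
  }
}

type ToolOutcome<T> =
  | { status: "done"; result: T }
  | { status: "paused"; prompt: string };

// Trips the breaker when risk meets the threshold and the user has not approved.
function riskGate(tool: string, riskScore: number, threshold: number, userApproved: boolean): void {
  if (riskScore >= threshold && !userApproved) {
    throw new RiskInterrupt(tool, riskScore);
  }
}

// Agent-loop side: a tripped breaker becomes a pause with a prompt instead of a silent failure.
function executeTool<T>(
  tool: string,
  riskScore: number,
  threshold: number,
  userApproved: boolean,
  run: () => T,
): ToolOutcome<T> {
  try {
    riskGate(tool, riskScore, threshold, userApproved);
    return { status: "done", result: run() };
  } catch (err) {
    if (err instanceof RiskInterrupt) {
      return { status: "paused", prompt: err.message };
    }
    throw err; // unrelated errors still propagate
  }
}
```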
What This Means for the Ecosystem
The identity infrastructure for AI agents is evolving rapidly. Six active IETF Internet-Drafts address agent authentication and authorization, including proposals for agent identity management systems (AIMS), SCIM schemas for non-human identities, and trust scoring for autonomous agent transactions. Auth0's Token Vault, by implementing RFC 8693 token exchange with a managed credential store and framework-native SDKs, provides a practical foundation that aligns with these emerging standards.
The gap is not in authentication — Auth0 handles that well. The gap is in what happens next. Making post-authentication agent behavior observable, auditable, and controllable is the next essential capability. The primitives already exist in Auth0's platform; they need to be composed into patterns that the broader AI agent developer community can adopt.
We hope the patterns documented here — credential-event correlation, scope-bound risk classification, and interrupt-as-circuit-breaker — contribute useful building blocks toward that goal.
Built with Auth0 Token Vault, Auth0 FGA, Next.js 16, and Vercel AI SDK v6. Source code: github.com/SunflowersLwtech/Astrolabe
References:
- South, T. et al. (2025). "Position: AI Agents Need Authenticated Delegation." ICML 2025. arXiv:2501.09674
- Goswami, A. (2025). "Agentic JWT: A Secure Delegation Protocol for Autonomous AI Agents." arXiv:2509.13597
- OWASP GenAI Security Project. (2025). "OWASP Top 10 for Agentic Applications." genai.owasp.org
- VentureBeat. (2026). "RSAC 2026 shipped five agent identity frameworks and left three critical gaps open." venturebeat.com
- IETF. (2020). "OAuth 2.0 Token Exchange." RFC 8693
- OpenFGA. "Fine Grained Authorization at Scale." openfga.dev
- Auth0. "Token Vault for AI Agents." auth0.com/ai/docs/intro/token-vault
Built With
- auth0
- auth0-token-vault
- gpt-4o
- next.js
- openai
- react
- remotion
- shadcn-ui
- tailwindcss
- typescript
- upstash-redis
- vercel-ai-sdk