Inspiration

Every AI agent integration today follows the same pattern: "Grant access to your Gmail, Calendar, and Tasks." The user clicks "Allow All" because there is no alternative, and the agent immediately holds the keys to their digital life.

We asked: what if AI agents had to earn trust, one permission at a time?

The principle of least privilege is foundational to security, but almost entirely absent from AI agents. Agents request maximum permissions upfront because the tooling makes progressive authorization hard. Auth0 Token Vault changes that — it provides a credential broker that can issue scoped tokens on demand, without the agent ever touching raw OAuth secrets.

But Token Vault alone is not enough. What the ecosystem lacks is the enforcement layer on top: per-agent tool isolation, risk classification, anomaly detection, tamper-proof audit trails, automatic scope expiry, rate limiting, and cryptographic delegation chains.

Scope Lock proves all of this can be built today — a complete authorization framework with 300 automated tests.

What it does

Scope Lock is an Email Triage Agent that helps professionals manage their inbox, calendar, and tasks while enforcing progressive authorization at every layer.

Progressive Mode (default): A single agent starts with zero permissions and earns each scope inline through branded consent cards. After triaging emails, it suggests cross-service follow-ups (check calendar, create task, draft reply) — each requiring a new scope grant.

Strict Isolation Mode (toggle): Two agents (Reader + Writer) with hard credential boundaries enforced at the SDK layer — not prompt engineering. The Reader Agent physically cannot invoke write tools because they don't exist in its execution context.

Key features:

  • Risk-Tier Policy Engine — GREEN (auto-approve reads), AMBER (warn on writes). All 8 tools classified.
  • Scope Presets — Lockdown (0 tools), Privacy (read-only), Productivity (full access). Double intersection with agent filtering.
  • Branded Consent Cards — Service name, risk level, data access description, TTL display, dismiss/cancel/timeout handling.
  • Active Scopes Bar — Zero-trust start, progressively fills as scopes are granted.
  • SHA-256 Hash-Chained Audit Trail — Every tool call logged with tamper detection. Chain verification via /api/audit/verify.
  • Anomaly Detection — Rapid escalation, high frequency, scope hopping, unusual scope patterns.
  • Scope TTL Auto-Expiration — GREEN=30min, AMBER=10min. Permissions decay automatically.
  • Per-Agent Rate Limiting — Reader=50/5min, Writer=15/5min.
  • Agent Delegation Chains — SHA-256 signed transitions between agents.
  • credentialsContext Tuning: 'thread' for reads (performance), 'tool-call' for writes (security isolation).
  • Security Dashboard — Security score, scope topology, JWT inspector, scope analytics (pure CSS charts), consent timeline, policy rules, anomaly alerts, delegation chain visualization.
  • Security Sandbox — 14 automated assertions running against the live system.
  • Authorization Matrix — Visual cross-reference of agents × tools × presets.
  • Auth0 Actions Showcase — Post-Login and Token Exchange reference implementations.
  • Rich Authorization Requests (RFC 9396) — Structured transaction payloads for machine-readable audit.
  • 300 Tests across 12 test files + 14 live security assertions.
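
The "double intersection" behind Scope Presets can be sketched as follows. This is a minimal illustration, not the project's code: the ToolInfo shape, the effectiveTools helper, and the tool names are hypothetical, and the read/write split is assumed to be what separates Privacy from Productivity.

```typescript
// Hypothetical tool metadata; the real project classifies 8 tools.
type Preset = "lockdown" | "privacy" | "productivity";

interface ToolInfo {
  name: string;
  kind: "read" | "write";
}

// First filter: does the preset allow this tool at all?
function presetAllows(preset: Preset, tool: ToolInfo): boolean {
  switch (preset) {
    case "lockdown":
      return false; // 0 tools
    case "privacy":
      return tool.kind === "read"; // read-only
    case "productivity":
      return true; // full access
  }
}

// Second filter: intersect with the tools the agent profile allows.
function effectiveTools(
  all: ToolInfo[],
  preset: Preset,
  agentAllowed: string[],
): string[] {
  const agentSet = new Set(agentAllowed);
  return all
    .filter((tool) => presetAllows(preset, tool) && agentSet.has(tool.name))
    .map((tool) => tool.name);
}

// Placeholder tool names for illustration only.
const allTools: ToolInfo[] = [
  { name: "gmailSearchTool", kind: "read" },
  { name: "gmailDraftTool", kind: "write" },
];

const writerUnderPrivacy = effectiveTools(allTools, "privacy", ["gmailDraftTool"]);
const readerUnderProductivity = effectiveTools(allTools, "productivity", ["gmailSearchTool"]);
```

A tool reaches the model only if both the preset and the agent profile allow it, so a Writer agent under the Privacy preset ends up with no tools.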

How we built it

Stack: Next.js 15, Vercel AI SDK, OpenAI GPT-4o, Auth0 Token Vault, TypeScript, Tailwind CSS.

Token Vault Integration: 4 Google service connections with explicit credentialsContext per tool — 'thread' for reads, 'tool-call' for writes. Each withTokenVault() call configures connection, scopes, and credential lifecycle.

Multi-Agent Isolation: Agent profiles define allowed tools. The chat route filters the tool map before passing it to streamText(): Object.entries(tools).filter(([name]) => allowedToolNames.includes(name)). Tools are physically excluded, not prompt-restricted.
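
A minimal sketch of that filtering step, assuming a tool map keyed by tool name. AgentProfile, filterToolsForAgent, and the tool names are hypothetical stand-ins; the real route would pass Vercel AI SDK tool objects. Because Object.entries(...).filter(...) yields an array of entries, the sketch rebuilds an object with Object.fromEntries before handing it on.

```typescript
// Hypothetical agent profile shape.
interface AgentProfile {
  id: string;
  allowedToolNames: string[];
}

// Returns a new map that contains only the tools the agent may invoke.
// Excluded tools are absent from the result, not merely flagged.
function filterToolsForAgent<T extends Record<string, unknown>>(
  tools: T,
  profile: AgentProfile,
): Partial<T> {
  const allowed = new Set(profile.allowedToolNames);
  return Object.fromEntries(
    Object.entries(tools).filter(([name]) => allowed.has(name)),
  ) as Partial<T>;
}

// Placeholder tools for illustration only.
const tools = {
  gmailSearchTool: { kind: "read" },
  gmailDraftTool: { kind: "write" },
};

const readerProfile: AgentProfile = {
  id: "reader",
  allowedToolNames: ["gmailSearchTool"],
};

// The filtered map is what would be passed as the tools option.
const readerTools = filterToolsForAgent(tools, readerProfile);
```

The model cannot call a tool that is missing from the map it receives, which is why this counts as enforcement rather than instruction.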

Policy Engine: Maps every tool to GREEN/AMBER/RED with action and auth requirement. Unknown tools default to AMBER (fail-safe).
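
The fail-safe default can be sketched like this. The rule table, the classifyTool name, and the tool names are assumptions for illustration. The auth-requirement field is left out because its shape is not described here, and no RED rule is shown because the RED action is not specified.

```typescript
type RiskTier = "GREEN" | "AMBER" | "RED";

interface PolicyRule {
  tier: RiskTier;
  action: string; // e.g. "auto-approve" for GREEN, "warn" for AMBER
}

// Applied to any tool that has no explicit rule.
const UNKNOWN_TOOL_RULE: PolicyRule = { tier: "AMBER", action: "warn" };

// Placeholder entries; the real engine classifies all 8 tools.
const POLICY = new Map<string, PolicyRule>([
  ["gmailSearchTool", { tier: "GREEN", action: "auto-approve" }],
  ["gmailDraftTool", { tier: "AMBER", action: "warn" }],
]);

// Looks up a tool's rule and falls back to AMBER for unknown tools,
// so a newly added tool is never silently auto-approved.
function classifyTool(toolName: string): PolicyRule {
  return POLICY.get(toolName) ?? UNKNOWN_TOOL_RULE;
}

const knownRead = classifyTool("gmailSearchTool");
const unknownTool = classifyTool("someNewTool");
```

Using a Map rather than a plain object avoids accidental matches on inherited keys such as "toString".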

Audit Chain: SHA-256 hash of previousHash:toolName:scopes:timestamp:success:riskLevel:connection. Genesis hash = 64 zeroes. verifyAuditChain() walks the chain and pinpoints tampering.
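
A minimal sketch of the chain, using Node's built-in crypto module. The field order follows the description above. The AuditEntry shape, the comma-joined scopes, the ISO timestamp strings, and the appendEntry helper are assumptions, and this verifyAuditChain is an illustrative reimplementation rather than the project's function.

```typescript
import { createHash } from "node:crypto";

interface AuditPayload {
  toolName: string;
  scopes: string[];
  timestamp: string; // assumed ISO 8601 string
  success: boolean;
  riskLevel: string;
  connection: string;
}

interface AuditEntry extends AuditPayload {
  previousHash: string;
  hash: string;
}

const GENESIS_HASH = "0".repeat(64); // 64 zeroes

// Serializes fields in a fixed order so the hash is reproducible.
function computeHash(previousHash: string, e: AuditPayload): string {
  const payload = [
    previousHash, e.toolName, e.scopes.join(","), e.timestamp,
    String(e.success), e.riskLevel, e.connection,
  ].join(":");
  return createHash("sha256").update(payload).digest("hex");
}

function appendEntry(chain: AuditEntry[], e: AuditPayload): AuditEntry[] {
  const previousHash = chain.slice(-1)[0]?.hash ?? GENESIS_HASH;
  return [...chain, { ...e, previousHash, hash: computeHash(previousHash, e) }];
}

// Walks the chain and reports the index of the first inconsistent entry.
function verifyAuditChain(chain: AuditEntry[]): { valid: boolean; brokenAt: number | null } {
  let previousHash = GENESIS_HASH;
  const brokenAt = chain.findIndex((entry) => {
    const ok =
      entry.previousHash === previousHash &&
      entry.hash === computeHash(previousHash, entry);
    previousHash = entry.hash;
    return !ok;
  });
  return { valid: brokenAt === -1, brokenAt: brokenAt === -1 ? null : brokenAt };
}

// Placeholder entries for illustration only.
const base = { scopes: ["gmail.readonly"], success: true, riskLevel: "GREEN", connection: "google-oauth2" };
let chain: AuditEntry[] = [];
chain = appendEntry(chain, { ...base, toolName: "gmailSearchTool", timestamp: "2025-01-01T00:00:00.000Z" });
chain = appendEntry(chain, { ...base, toolName: "calendarListTool", timestamp: "2025-01-01T00:00:05.000Z" });
chain = appendEntry(chain, { ...base, toolName: "tasksListTool", timestamp: "2025-01-01T00:00:09.000Z" });

// Simulate tampering with the second entry without recomputing its hash.
const tampered = chain.map((e, i) => (i === 1 ? { ...e, toolName: "gmailDraftTool" } : e));
const intact = verifyAuditChain(chain);
const broken = verifyAuditChain(tampered);
```

Serialization consistency matters here: if scopes were joined in a different order at verification time, an untouched entry would fail the check.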

Anomaly Detection: 4 pattern classifiers running after every tool call — rapid escalation, high frequency, scope hopping, unusual scope.

Dashboard: Pure CSS visualizations — conic-gradient donuts, flexbox stacked bars, grid topology. Zero charting library dependencies.

Challenges we ran into

  • credentialsContext tuning — One config line with massive security impact. Wrong choice causes either credential failures or unnecessary re-auth.
  • Multi-agent isolation without SDK support — Auth0 AI SDK doesn't natively support per-agent boundaries. We solved it with tool-map filtering.
  • Tamper-proof audit chains — Careful handling of genesis hash, entry ordering, and serialization consistency.
  • Anomaly threshold tuning — Balancing false positives against real attack detection.
  • Scope TTL at application layer — Token Vault doesn't support time-bound grants natively; we enforce expiry client-side.

Accomplishments that we're proud of

  • 314 automated quality checks — 300 unit/integration tests + 14 live security assertions.
  • Progressive Mode as the hero experience — Zero-trust start, branded consent cards, cross-service scope escalation in a single conversation.
  • Tamper-proof audit trail with real SHA-256 hash chain verification.
  • 4 anomaly detection patterns running in real-time.
  • Pure CSS security dashboard with zero chart library dependencies.
  • Domain-specific architecture — Email triage where permission escalation follows natural task progression.

What we learned

  • credentialsContext is the most underrated security primitive in the Auth0 AI SDK.
  • Multi-agent isolation requires enforcement (tool exclusion), not instructions (prompt engineering).
  • Risk classification of tool calls should be a framework feature, not app-specific.
  • Tamper-proof audit trails are mandatory for enterprise agent adoption.
  • Scope expiry transforms the security model from "grant once, persist forever" to natural permission decay.

What's next for Scope Lock

  • Auth0 FGA for document-level access control.
  • MCP Server authentication with Token Vault.
  • Server-side scope TTL enforcement via Token Vault.
  • NPM package for the progressive authorization pattern.

Bonus Blog Post

What Happens When You Treat AI Agent Authorization as a Real Security Problem

Most AI agent demos treat authorization as a checkbox: connect to Gmail, grant all scopes, move on. We wanted to find out what happens when you take it seriously — when you build the audit trail, the anomaly detection, the rate limiting, the scope expiry, and the tamper-proof logging that a production system actually needs.

The answer: you end up building an entire security framework, and you discover that the primitives for it barely exist.

Scope Lock started as an email triage agent. The user asks the agent to check their inbox. The agent needs gmail.readonly. Auth0 Token Vault brokers that credential without the LLM ever seeing the raw token. Simple enough.

But then the user wants to draft a reply. That requires gmail.compose — a write scope. And suddenly we are in different territory. A read operation that exposes data is categorically different from a write operation that creates data on the user's behalf. So we built a policy engine. Every tool call gets classified: GREEN for reads (auto-approve), AMBER for writes (warn and proceed). This model maps directly to real user expectations about what an agent should do silently versus what requires explicit approval.

Then we needed isolation. We built two approaches. The default is Progressive Mode — a single agent with access to all 8 tools that earns each scope inline through branded consent cards. But we also built Strict Isolation Mode with two sub-agents — Reader and Writer — each with a hard boundary around which tools they can access. Not prompt-level restrictions — the Reader Agent's execution context does not contain gmailDraftTool. We filter the tool map by agent ID before passing it to the Vercel AI SDK's streamText. This is enforcement at the SDK layer.

The insight that emerged: Progressive Mode is the better user experience. Users do not want to switch agents — they want a single conversation where permissions accumulate naturally. Strict Isolation proves the security model is real (tools are physically excluded), but Progressive Mode is what you would actually ship.

The most underrated discovery was credentialsContext. This single configuration parameter on each withTokenVault() call controls whether credentials are cached across tool invocations ('thread') or resolved fresh per call ('tool-call'). We use 'thread' for all reads and 'tool-call' for all writes. One line of config. Enormous security impact.

But enforcement without accountability is incomplete. So we built a SHA-256 hash-chained audit trail. Every tool call gets logged with its scopes, connection, risk level, and timestamp. Each entry's hash is computed from the previous entry's hash plus its own payload. Modify any entry, and every subsequent hash breaks.

On top of the audit trail, we added anomaly detection. Four patterns: rapid privilege escalation (GREEN to AMBER within 60 seconds), high-frequency calls (>10 per minute), cross-service scope hopping (3+ connections in 30 seconds), and novel tool usage. These run in real-time after every tool call.
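
The four checks can be sketched with the thresholds quoted above. This is one possible reading, not the project's classifiers: the ToolCall shape, the detectAnomalies name, and treating "novel" as "not in a baseline set of previously seen tools" are all assumptions.

```typescript
type RiskTier = "GREEN" | "AMBER";

interface ToolCall {
  toolName: string;
  riskLevel: RiskTier;
  connection: string;
  at: number; // epoch milliseconds
}

type Anomaly = "rapid-escalation" | "high-frequency" | "scope-hopping" | "novel-tool";

function detectAnomalies(
  history: ToolCall[],
  current: ToolCall,
  baselineTools: Set<string>,
): Anomaly[] {
  // Earlier calls that fall inside a trailing window ending at the current call.
  const within = (windowMs: number) =>
    history.filter((c) => c.at <= current.at && current.at - c.at <= windowMs);

  const found: Anomaly[] = [];

  // GREEN to AMBER within 60 seconds.
  if (current.riskLevel === "AMBER" && within(60_000).some((c) => c.riskLevel === "GREEN")) {
    found.push("rapid-escalation");
  }
  // More than 10 calls per minute, counting the current call.
  if (within(60_000).length + 1 > 10) {
    found.push("high-frequency");
  }
  // 3 or more distinct connections in 30 seconds.
  const connections = new Set(within(30_000).map((c) => c.connection));
  connections.add(current.connection);
  if (connections.size >= 3) {
    found.push("scope-hopping");
  }
  // Tool not seen in the baseline.
  if (!baselineTools.has(current.toolName)) {
    found.push("novel-tool");
  }
  return found;
}

// Placeholder data for illustration only.
const baseline = new Set(["gmailSearchTool", "gmailDraftTool"]);
const history: ToolCall[] = [
  { toolName: "gmailSearchTool", riskLevel: "GREEN", connection: "gmail", at: 0 },
];
const escalation = detectAnomalies(
  history,
  { toolName: "gmailDraftTool", riskLevel: "AMBER", connection: "gmail", at: 10_000 },
  baseline,
);
const novel = detectAnomalies(
  history,
  { toolName: "tasksCreateTool", riskLevel: "GREEN", connection: "tasks", at: 10_000 },
  baseline,
);
```

Under this reading a read followed quickly by a write is always flagged, which is the kind of false positive that makes threshold tuning necessary.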

We added scope TTL — time-bound grants that expire automatically. GREEN scopes last 30 minutes. AMBER scopes last 10 minutes. Without active renewal, all permissions decay to zero.
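
Application-layer expiry of this kind can be sketched as a pure function over recorded grants. The ScopeGrant shape and the helper names are hypothetical; only the 30-minute and 10-minute values come from the text.

```typescript
type RiskTier = "GREEN" | "AMBER";

// TTL per risk tier, in milliseconds.
const TTL_MS: Record<RiskTier, number> = {
  GREEN: 30 * 60 * 1000,
  AMBER: 10 * 60 * 1000,
};

interface ScopeGrant {
  scope: string;
  tier: RiskTier;
  grantedAt: number; // epoch milliseconds
}

// A grant is active only while its tier's TTL has not elapsed.
function isActive(grant: ScopeGrant, now: number): boolean {
  return now - grant.grantedAt < TTL_MS[grant.tier];
}

// Scopes that may still be used at the given time.
function activeScopes(grants: ScopeGrant[], now: number): string[] {
  return grants.filter((g) => isActive(g, now)).map((g) => g.scope);
}

// Both scopes granted at t = 0.
const grants: ScopeGrant[] = [
  { scope: "gmail.readonly", tier: "GREEN", grantedAt: 0 },
  { scope: "gmail.compose", tier: "AMBER", grantedAt: 0 },
];

const minutes = (n: number) => n * 60 * 1000;
const atFive = activeScopes(grants, minutes(5));
const atFifteen = activeScopes(grants, minutes(15));
const atThirtyOne = activeScopes(grants, minutes(31));
```

Passing the current time in as a parameter keeps the check deterministic, so the decay from two scopes to one to none can be tested without waiting.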

We added per-agent rate limiting, cryptographic delegation chains, Rich Authorization Requests per RFC 9396, and Auth0 Actions for Post-Login and Token Exchange hooks.
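
For the rate limiting piece, here is a minimal sketch using the limits listed under Key features (Reader 50 calls per 5 minutes, Writer 15 calls per 5 minutes). The sliding-window approach, the class name, and denying unknown agents are assumptions; the write-up does not say which algorithm the project uses.

```typescript
interface RateLimit {
  maxCalls: number;
  windowMs: number;
}

const FIVE_MINUTES = 5 * 60 * 1000;

const AGENT_LIMITS: Record<string, RateLimit> = {
  reader: { maxCalls: 50, windowMs: FIVE_MINUTES },
  writer: { maxCalls: 15, windowMs: FIVE_MINUTES },
};

class SlidingWindowLimiter {
  private readonly limits: Record<string, RateLimit>;
  private readonly calls = new Map<string, number[]>();

  constructor(limits: Record<string, RateLimit>) {
    this.limits = limits;
  }

  // Records the call and returns true if the agent is under its limit.
  tryAcquire(agentId: string, now: number): boolean {
    const limit = this.limits[agentId];
    if (!limit) return false; // assumption: unknown agents are denied

    // Keep only timestamps that are still inside the window.
    const recent = (this.calls.get(agentId) ?? []).filter(
      (t) => now - t < limit.windowMs,
    );
    const allowed = recent.length < limit.maxCalls;
    if (allowed) recent.push(now);
    this.calls.set(agentId, recent);
    return allowed;
  }
}

const limiter = new SlidingWindowLimiter(AGENT_LIMITS);

// Fifteen writer calls, one per second, then a sixteenth.
const firstFifteen = Array.from({ length: 15 }, (_, i) =>
  limiter.tryAcquire("writer", i * 1000),
);
const sixteenth = limiter.tryAcquire("writer", 15_000);

// Once the earliest calls leave the window, capacity returns.
const afterWindow = limiter.tryAcquire("writer", FIVE_MINUTES + 1000);
```

The tighter Writer limit reflects the same idea as the risk tiers: write operations get less headroom than reads.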

The result is 300 automated tests across 12 test files and 14 live security assertions, all passing.

The gap we found is clear: Token Vault handles credential brokering beautifully. What is missing is everything above it — the policy engine, the per-agent isolation, the audit trail, the anomaly detection, the scope expiry, the rate limiting. These are not application-specific concerns. They are universal needs for any agent that touches real user data. Auth0 should ship them as platform features.

We built what the platform does not yet provide. And we proved it works with 300 tests.

Built With

  • auth0
  • ciba
  • gpt-4o
  • nextjs
  • oauth2
  • openai
  • react
  • sha-256
  • tailwindcss
  • token-vault
  • typescript
  • vercel-ai-sdk