Inspiration
I kept reading reports about AI agents going outside their expected boundaries. And every time I built a multi-agent system myself, I kept hitting the same problem: I had no idea what my own agents were actually doing once they had a token.
Auth0 Token Vault solved the credential problem really well, but the moment a token left the vault, the trail went cold. There’s no log for “the agent called Gmail’s users.messages.send endpoint at 2:47pm with this payload.”
That’s the gap I wanted to fix.
How I Built It
The supervisor is a LangGraph StateGraph running Google’s gemini-2.5-pro via Vertex AI. It uses MessagesAnnotation for the message channel and PostgresSaver as a checkpointer so conversations persist across requests.
The supervisor binds 15 tools across four sub-agents (Calendar, Email, GitHub, Drive). Each tool is just a thin wrapper that hands its arguments to a single function: callProxy.
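The thin-wrapper pattern can be sketched like this. This is a minimal sketch, not the actual AgenLens code: the request shape, the /api/proxy route path, and the helper names are assumptions.

```typescript
// Hypothetical request shape the proxy might accept (field names assumed).
type ProxyRequest = {
  agent: "calendar" | "email" | "github" | "drive";
  endpoint: string;
  method: "GET" | "POST" | "PATCH" | "DELETE";
  body?: unknown;
};

// A thin tool wrapper only shapes its arguments into a ProxyRequest...
function buildSendEmailRequest(to: string, subject: string, text: string): ProxyRequest {
  return {
    agent: "email",
    endpoint: "/gmail/v1/users/me/messages/send",
    method: "POST",
    body: { to, subject, text },
  };
}

// ...and hands it to a single callProxy function that owns all HTTP concerns.
async function callProxy(req: ProxyRequest): Promise<unknown> {
  const res = await fetch("/api/proxy", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!res.ok) throw new Error(`proxy rejected request: ${res.status}`);
  return res.json();
}
```

Keeping the wrappers this thin means every tool call, regardless of sub-agent, funnels through one choke point where policy, step-up, and logging can live.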
The Proxy
The proxy is the most important part. It’s a Next.js API route that does five things in order:
1. Token resolution
It tries auth0.getAccessTokenForConnection first (Token Vault federated token exchange). If that fails, it falls back to the Auth0 Management API to fetch the user's stored identity. If the access token is expired, it calls Google's oauth2/token endpoint directly with the refresh token that Token Vault keeps.
2. Policy evaluation
A small policy engine checks the request against user-defined AgentPolicy rows in Postgres (allowlist, blocklist, rate limit, time restriction, resource restriction). Fail-closed by design.
3. Step-up check
If the action is a sensitive write (e.g., sending email, creating an issue), the proxy creates a StepUpEvent row with status pending, then polls every 5 seconds for up to 60 seconds while the chat UI shows an inline approval card.
4. Request forwarding
It injects the resolved token into the outbound HTTP request and forwards it to the third-party API.
5. Activity logging
It writes a sanitized AgentActivity row to Postgres with:
- agent type
- endpoint
- method
- scopes used
- response status
- duration
- human-readable description (e.g., "Email Agent searched Gmail for messages from this week")
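The fail-closed policy check in step 2 could look roughly like this. A sketch only: the AgentPolicy field names and the evaluation order are assumptions, not the real schema.

```typescript
// Hypothetical shape of a user-defined AgentPolicy row (fields assumed).
type AgentPolicy = {
  agent: string;
  allowlist?: string[];            // endpoint prefixes explicitly permitted
  blocklist?: string[];            // endpoint prefixes always denied
  maxPerHour?: number;             // simple rate limit
  allowedHours?: [number, number]; // e.g. [9, 18] local time
};

function evaluatePolicy(
  policy: AgentPolicy | undefined,
  endpoint: string,
  recentCalls: number,
  hour: number
): { allowed: boolean; reason: string } {
  // Fail closed: no policy row means no access at all.
  if (!policy) return { allowed: false, reason: "no policy defined" };
  if (policy.blocklist?.some((p) => endpoint.startsWith(p)))
    return { allowed: false, reason: "endpoint blocklisted" };
  if (policy.allowlist && !policy.allowlist.some((p) => endpoint.startsWith(p)))
    return { allowed: false, reason: "endpoint not on allowlist" };
  if (policy.maxPerHour !== undefined && recentCalls >= policy.maxPerHour)
    return { allowed: false, reason: "rate limit exceeded" };
  if (policy.allowedHours) {
    const [start, end] = policy.allowedHours;
    if (hour < start || hour >= end)
      return { allowed: false, reason: "outside allowed hours" };
  }
  return { allowed: true, reason: "ok" };
}
```

The important design choice is the first guard: an agent with no AgentPolicy row gets nothing, rather than everything.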
Dashboard
The dashboard uses Next.js 16 (App Router), shadcn/ui, and Recharts.
- The activity feed polls every 3 seconds
- The analytics page renders:
  - interactive area chart
  - donut chart
  - stacked bar chart
  - ranked service list

All powered from the same underlying data.
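The activity-feed polling loop is straightforward. A minimal sketch, assuming a fetchPage callback rather than the actual dashboard code; the function names are mine, not AgenLens's.

```typescript
// Hypothetical polling loop: fetch the latest activity rows every few seconds
// and push them to the UI. Returns a stop function for cleanup on unmount.
function startActivityPoll(
  fetchPage: () => Promise<unknown[]>,
  onUpdate: (rows: unknown[]) => void,
  intervalMs = 3000
): () => void {
  let stopped = false;
  const tick = async () => {
    if (stopped) return;
    try {
      onUpdate(await fetchPage());
    } catch {
      // Ignore transient errors; the next tick retries.
    }
  };
  void tick(); // fire immediately so the feed isn't empty for 3s
  const id = setInterval(tick, intervalMs);
  return () => {
    stopped = true;
    clearInterval(id);
  };
}
```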
Challenges
Token expiration
Google access tokens expire after exactly 1 hour. I initially thought my code was broken, but the issue was simply token expiry.
The fix was to call Google’s OAuth endpoint directly using the refresh tokens stored in Token Vault.
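That fix reduces to two pieces: a cheap expiry check (with a safety skew so the token isn't used in its final seconds), and a direct call to Google's token endpoint. The refresh call below is a sketch; the env var names are assumptions, though the endpoint URL and form parameters are Google's documented refresh-token grant.

```typescript
// Treat a token as expired slightly before its real deadline.
function isExpired(expiresAtMs: number, now = Date.now(), skewMs = 60_000): boolean {
  return now >= expiresAtMs - skewMs;
}

// Exchange a stored refresh token for a fresh access token via Google's
// OAuth 2.0 token endpoint (standard refresh_token grant).
async function refreshGoogleToken(refreshToken: string): Promise<string> {
  const res = await fetch("https://oauth2.googleapis.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: refreshToken,
      client_id: process.env.GOOGLE_CLIENT_ID ?? "",     // assumed env var name
      client_secret: process.env.GOOGLE_CLIENT_SECRET ?? "", // assumed env var name
    }),
  });
  if (!res.ok) throw new Error(`token refresh failed: ${res.status}`);
  const data = (await res.json()) as { access_token: string; expires_in: number };
  return data.access_token;
}
```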
Step-up friction (CIBA)
CIBA step-up requires the Auth0 Guardian app on the user’s phone, which adds friction (especially for demos or judges).
I built an in-app polling alternative using StepUpEvent rows in Postgres so approvals happen within the same chat interface.
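The approval loop amounts to polling the StepUpEvent row until it resolves or the window closes. A hedged sketch, assuming a getStatus callback that reads the row from Postgres; the names and the fail-closed timeout are my framing of the behavior described above.

```typescript
// Hypothetical status values for a StepUpEvent row.
type StepUpStatus = "pending" | "approved" | "denied";

// Poll every intervalMs (the proxy uses 5s) for up to timeoutMs (60s).
// Anything other than an explicit approval resolves to false: fail closed.
async function waitForApproval(
  getStatus: () => Promise<StepUpStatus>,
  intervalMs = 5000,
  timeoutMs = 60_000
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const status = await getStatus();
    if (status === "approved") return true;
    if (status === "denied") return false;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  return false; // timed out without approval
}
```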
Vercel deployment
Deploying to Vercel introduced a challenge: Vertex AI typically relies on gcloud Application Default Credentials locally, but serverless environments don’t have gcloud.
Solution:
- Base64-encode a service account JSON
- Store it in a Vercel environment variable
- Decode it to /tmp on cold start
- Point GOOGLE_APPLICATION_CREDENTIALS to that file
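Those four steps fit in one small cold-start helper. A sketch under assumptions: the env var name GCP_SA_KEY_B64 and the /tmp filename are mine, not necessarily what the project uses.

```typescript
import { existsSync, writeFileSync } from "node:fs";

// On cold start, materialize the base64-encoded service-account JSON from an
// environment variable onto /tmp, then point the Google SDK at it.
function ensureGoogleCredentials(): string {
  const path = "/tmp/gcp-sa.json"; // assumed filename
  if (!existsSync(path)) {
    const b64 = process.env.GCP_SA_KEY_B64; // assumed env var name
    if (!b64) throw new Error("GCP_SA_KEY_B64 not set");
    writeFileSync(path, Buffer.from(b64, "base64").toString("utf8"));
  }
  process.env.GOOGLE_APPLICATION_CREDENTIALS = path;
  return path;
}
```

The existsSync guard means warm invocations skip the decode entirely; only true cold starts pay the file-write cost.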
What I Learned
A big realization was that most of the “agent authorization” problem actually lives after token exchange.
Token Vault handles the credential layer well. But once a token is in an agent’s hands, a new set of challenges emerges:
- per-request token injection
- real-time policy evaluation
- task-level consent
- delegation chain tracing
- sanitized observability
These are the problems developers building multi-agent systems actually need to solve—and they don’t fit inside a vault.
Blog Post
What Building AgenLens Taught Me About Token Vault
I started this project thinking Token Vault would do most of the heavy lifting. OAuth, refresh, secure storage, all the major providers built in. And honestly, it does. The first time I got a Google token back through it I was kind of surprised at how little code I had to write.
Then I tried to build a multi-agent system on top of it and ran into stuff I wasn't expecting.
The first thing that bothered me was that I had no idea what my agents were actually doing once they had a token. Auth0 logs that the exchange happened, and then it goes silent. My calendar agent could call the Gmail API and nobody would notice. That's the gap I built the proxy for. Every sub-agent's API call routes through it, gets logged, and shows up on the dashboard in real time.
The second thing was scoping. Token Vault gives you one token per connection, not per sub-agent. So my calendar and email agents both end up with the same Google token and the same scopes, even though they really shouldn't. I worked around it by injecting narrowly scoped tokens per request inside the proxy, so the agents themselves never hold the raw credentials.
The hardest one to actually debug was Google access tokens dying after exactly 1 hour. I spent way too long thinking my code was broken before I realized the tokens were just expiring. Ended up calling Google's OAuth endpoint directly with the refresh tokens that Token Vault already stores, just to keep agents alive past the first hour.
Looking back, building on top of Token Vault meant I could spend my time on the problems I actually wanted to solve. Real visibility into what agents are doing. Per-agent scoping that goes beyond what a single connection can express. Approvals that happen in the same window the user is already working in, not on a separate device. None of those turned out to be small problems, and I think they're the next set of things any team building multi-agent systems is going to run into. That's what AgenLens is.
Built With
- auth0
- gemini
- google-cloud
- javascript
- langgraph
- neon
- next.js
- typescript
- vercel