Inspiration
Every week there's a new AI agent demo: booking flights, filing taxes, managing calendars. All of them are impressive enough until you ask the fated question: who authorized that? Today's AI agents either operate in sandboxes too restrictive to be useful, or they're given unrestricted access to real accounts with zero accountability and, potentially, significant consequences. There's no identity layer, no policy enforcement, and no audit trail whatsoever. If an agent clicks "confirm purchase" on your behalf, there's no cryptographic proof of what happened, why it happened, or who approved it.
We built AgentTrust because we believe autonomous agents will never leave the demo stage until they have trust infrastructure as rigorous as what humans rely on: every action requires authentication, authorization, and auditability. Auth0 already solved this problem for human users! We wanted to extend that same trust model to AI agents acting on a user's behalf.
What it does
AgentTrust is the identity, policy, and audit layer that sits between an AI agent and the browser. Every single browser action (clicks, searches, form submissions) must pass through AgentTrust validation before it executes.
- Identity-bound execution: Every action carries an Auth0 M2M JWT. The agent authenticates as a machine client, not a user, creating a clear separation of identity.
- Pre-execution risk classification: AgentTrust scores every action across domain sensitivity, keyword matching, URL patterns, and form field analysis. Actions are classified as low, medium, high, or blocked before they run.
- Human-in-the-loop approval: High-risk actions (sending emails, submitting a payment, logging in with credentials) trigger a real-time approval request in AgentTrust.
- Cryptographic audit trail: Every action is linked in a SHA-256 hash chain, making the log tamper-evident. Each entry includes agent identity, risk level, session, and screenshots.
- External API access: For API integrations such as GitHub and Google Calendar, the agent calls APIs directly through Auth0's identity infrastructure using Token Vault, bypassing the browser entirely.
- Encrypted credential vault: Saved login credentials are encrypted with AES-256-GCM. Passwords are used by the auto-login engine directly and never enter the LLM context.
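To make the hash-chain idea concrete, here is a minimal sketch of an append-and-verify audit chain in Python. Field names (`agent`, `action`, `risk`) are illustrative assumptions, not the actual AgentTrust schema:

```python
import hashlib
import json

def append_entry(chain, entry):
    """Link a new audit entry to the previous one via SHA-256 (sketch)."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    payload = json.dumps({**entry, "prev_hash": prev_hash}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    chain.append({**entry, "prev_hash": prev_hash, "hash": digest})
    return chain

def verify_chain(chain):
    """Recompute every hash; any tampered entry breaks the link."""
    prev_hash = "0" * 64
    for row in chain:
        body = {k: v for k, v in row.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True)
        if row["prev_hash"] != prev_hash:
            return False
        if hashlib.sha256(payload.encode()).hexdigest() != row["hash"]:
            return False
        prev_hash = row["hash"]
    return True
```

Because each entry's hash covers the previous entry's hash, editing any record invalidates every record after it.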
How we built it
AgentTrust is three systems working together:
Python Agent: A LangGraph state machine orchestrates a structured pipeline that works in 5 steps: CLASSIFY (User Intent), PLAN (Task Decomposition), OBSERVE (Real-Time Page Capture), ACT (Action Selection), VERIFY (Action Confirmation). An action history RAG retrieves similar past tasks to improve future consistency.
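The five-step pipeline can be sketched in plain Python (this is a simplification of the real LangGraph wiring; the `tools` callbacks are hypothetical stand-ins for the model calls and browser driver):

```python
def run_pipeline(user_prompt, tools):
    """Minimal sketch of the CLASSIFY -> PLAN -> OBSERVE -> ACT -> VERIFY flow."""
    state = {"prompt": user_prompt, "done": False, "history": []}
    state["intent"] = tools["classify"](state["prompt"])       # CLASSIFY user intent
    state["plan"] = tools["plan"](state["intent"])             # PLAN task decomposition
    for sub_goal in state["plan"]:
        while not state["done"]:
            page = tools["observe"]()                          # OBSERVE real-time page capture
            action = tools["act"](sub_goal, page)              # ACT action selection
            state["history"].append(action)
            state["done"] = tools["verify"](sub_goal, page)    # VERIFY action confirmation
        state["done"] = False                                  # reset for the next sub-goal
    return state["history"]
```

Forcing an observation before every action, and a verification after it, is what distinguishes this structure from a freeform ReAct loop.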
Node.js Backend: Express with a layered security middleware stack (Helmet, CORS, rate limiting, mongo-sanitize, HPP, input validation). Auth0 JWKS validation on every request. A policy engine classifies risk with a multi-signal scoring system. The audit service maintains a SHA-256 hash chain. Credentials are encrypted at rest with AES-256-GCM. CloudWatch Logs receives fire-and-forget structured records for every significant event. PostgreSQL on AWS RDS stores everything.
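The policy engine's multi-signal scoring might look roughly like the following (a Python sketch of the Node.js service; the signal tables and thresholds are illustrative assumptions):

```python
from urllib.parse import urlparse

# Illustrative signal tables; the real lists and weights are assumptions.
SENSITIVE_DOMAINS = {"bank.example.com": 3, "mail.example.com": 2}
RISK_KEYWORDS = {"purchase": 2, "password": 3, "delete": 2}
BLOCK_PATTERNS = ("/admin/", "/account/close")

def classify_action(url, text, form_fields=()):
    """Score an action across several signals and map it to a risk tier."""
    parsed = urlparse(url)
    score = SENSITIVE_DOMAINS.get(parsed.hostname, 0)           # domain sensitivity
    score += sum(w for kw, w in RISK_KEYWORDS.items()
                 if kw in text.lower())                          # keyword matching
    score += sum(2 for f in form_fields
                 if f in ("password", "card_number"))            # form field analysis
    if any(p in parsed.path for p in BLOCK_PATTERNS):            # URL patterns
        return "blocked"
    if score >= 5:
        return "high"
    if score >= 2:
        return "medium"
    return "low"
```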
Chrome Extension: A service worker, content scripts, and a popup dashboard. The chat tab renders user prompts, a collapsible live thinking breakdown, and agent responses. The Permissions tab manages domains, keywords, saved credentials, and OAuth account connections (GitHub, Google) through Auth0's social login flows.
Key technology decisions:
- Three-tier model architecture: GPT-4.1 for complex reasoning, GPT-4.1-mini for planning and responses, GPT-4.1-nano for intent classification.
- Auth0 as the sole identity provider: M2M tokens for the agent, user JWTs for the extension, Management API for provider token resolution, JWKS for validation.
- LangGraph State Management: Forced observation before every action, structured verification after every action, goal tracking with sub-task breakdowns.
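The three-tier routing boils down to a small dispatch table, sketched here (task names are hypothetical labels for the pipeline steps described above):

```python
# Hypothetical tier table; model names mirror the decisions listed above.
MODEL_TIERS = {
    "intent_classification": "gpt-4.1-nano",   # fastest, cheapest
    "planning": "gpt-4.1-mini",                # ~4x cheaper than the full model
    "responses": "gpt-4.1-mini",
    "action_selection": "gpt-4.1",             # only step needing complex reasoning
}

def pick_model(task):
    """Route each pipeline step to the cheapest model that can handle it."""
    return MODEL_TIERS.get(task, "gpt-4.1")    # fall back to the strongest model
```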
Challenges we ran into
The agent kept going in circles: Early on, the agent would navigate to a page, click back, navigate again, and repeat infinitely. We solved this with multi-layered loop detection: tracking recent action signatures, capping consecutive failures at 3 per sub-goal, and adding programmatic URL rewriting guards.
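The signature-tracking part of that loop detection can be sketched like this (window size and thresholds are illustrative; only the 3-failure cap comes from the text above):

```python
from collections import deque

class LoopGuard:
    """Track recent action signatures and cap consecutive failures (sketch)."""

    def __init__(self, window=6, max_failures=3):
        self.recent = deque(maxlen=window)   # sliding window of (action, url) pairs
        self.failures = 0
        self.max_failures = max_failures

    def should_abort(self, action, target_url):
        """Abort if the same action on the same URL repeats inside the window."""
        signature = (action, target_url)
        if self.recent.count(signature) >= 2:
            return True
        self.recent.append(signature)
        return False

    def record_failure(self):
        """Return True once the consecutive-failure cap for a sub-goal is hit."""
        self.failures += 1
        return self.failures >= self.max_failures

    def reset_failures(self):
        self.failures = 0
```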
Step-up approval timing was fragile: The agent would fire off an action, get a 403 requiring approval, but then immediately retry before the user had time to respond. We implemented long-polling with a 60-second timeout on the agent side and 2-minute auto-expiry on the backend, so the agent genuinely waits for human input.
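The agent-side wait can be sketched as a simple long-poll loop (the `poll_fn` endpoint and injectable clock/sleep are assumptions for illustration and testability):

```python
import time

def wait_for_approval(poll_fn, timeout=60.0, interval=2.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Long-poll the backend until the human approves, denies, or time runs out."""
    deadline = clock() + timeout
    while clock() < deadline:
        status = poll_fn()            # e.g. GET /approvals/<id> (hypothetical endpoint)
        if status in ("approved", "denied"):
            return status
        sleep(interval)
    return "expired"                  # the backend also auto-expires after 2 minutes
```

The key property is that the retry loop blocks on a human decision instead of racing it.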
Live progress kept disappearing: Prompt IDs weren't propagating reliably across LangGraph nodes. We fixed it by storing progress in an array outside of our LangGraph state, stripping screenshots from polling payloads to reduce size, and implementing DOM diffing so the UI only updates when content actually changes.
The agent hallucinated search results: When asked about current events, the intent classifier would categorize the request as "CHAT" (no browser needed) and the agent would fabricate an answer from its training data. We added real-time keyword detection (weather, today, latest, stock price, news) to force browser classification for anything requiring live information.
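That override amounts to a keyword check layered on top of the LLM's label, roughly like this (the keyword list below is a small illustrative subset):

```python
# Illustrative keyword list; the real one is larger.
LIVE_INFO_KEYWORDS = ("weather", "today", "latest", "stock price", "news")

def needs_browser(prompt, llm_label):
    """Override a CHAT classification when the prompt needs live information."""
    text = prompt.lower()
    if any(kw in text for kw in LIVE_INFO_KEYWORDS):
        return True                  # force browser use for real-time queries
    return llm_label != "CHAT"
```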
Vision feature led to model confusion: Initially, we gave the agent the ability to use our captured screenshots to detect elements on sites with greater precision. In practice, the agent would get stuck, confuse elements on a page, and spiral into an infinite feedback loop of choosing the wrong ones. Add the increased inference cost of putting images in the context window, and it was obvious we had to forgo the vision capabilities.
Accomplishments that we're proud of
- Zero-bypass enforcement: The driver-level wrapper means there is genuinely no code path that can perform a browser action without AgentTrust validation.
- SHA-256 hash chain: Every action's hash depends on the previous action's hash. You can verify the chain at any time, and any tampering invalidates the chain.
- Multi-model pipeline: Intent classification with GPT-4.1-nano returns in milliseconds. Planning with GPT-4.1-mini is 4x cheaper than the full model. Only action selection uses GPT-4.1. A single task costs roughly 60-70% less than running everything through GPT-4.1.
- LLM Transparency: The agent's reasoning steps appear in real time in the UI, making every action understandable and auditable.
- Auth0 as the identity backbone: M2M authentication, JWKS validation, social connections, Management API token resolution, and Token Vault are all handled through Auth0.
- Prompt Injection Deterrence: Prompt injection detection runs both server-side and during page scanning, deterring adversaries from hijacking the agent through malicious content embedded in websites.
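A page-scanning injection check can be as simple as pattern matching over page text. The patterns below are hypothetical examples; a real scanner would use a broader, maintained set:

```python
import re

# Hypothetical patterns; a production scanner uses a larger, curated list.
INJECTION_PATTERNS = [
    "ignore (all|any|previous) instructions",
    "you are now",
    "system prompt",
    "disregard .{0,30}(rules|policy|policies)",
]

def scan_for_injection(page_text):
    """Return the patterns that match suspicious instructions in page content."""
    lowered = page_text.lower()
    return [p for p in INJECTION_PATTERNS if re.search(p, lowered)]
```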
What we learned
- Trust is an infrastructure and design problem, not an AI problem: The AI model doesn't need to be "more trustworthy" — it needs to operate within infrastructure that enforces trust externally. The same way a web app doesn't trust user input, an agent governance layer shouldn't trust agent decisions.
- Structured agent pipelines over freeform loops: Overhauling a looping ReAct agent into a LangGraph state machine with forced observation, verification, and goal tracking eliminated entire categories of bugs.
- Identity separation matters: Having the agent authenticate as a machine (M2M) while the human authenticates as a user means the agent can never approve its own high-risk actions.
- Browser automation is adversarial: Websites actively fight automation and continuously add defenses against agent/bot traffic. Cookie banners, passkey dialogs, QR code popups, dynamic DOM changes, captchas, rate limiting, IP blocking, and misleading website descriptions were just some of the issues that plagued us. A production agent needs overlay dismissal, stale element recovery, multi-step login handling, and passkey suppression just to function on modern websites.
- Payload size kills real-time UIs: Screenshots in API responses were the single biggest cause of extension performance issues. Stripping them from polling endpoints while still maintaining them in our backend for auditing purposes and only including them on demand was the fix that made the live progress UI finally work smoothly.
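The fix itself is a small transform on the progress records, sketched here (field names are illustrative assumptions):

```python
def strip_for_polling(progress_entries):
    """Drop heavy fields (screenshots) from polling payloads; the full records stay server-side for auditing."""
    return [
        {k: v for k, v in entry.items() if k != "screenshot"}
        for entry in progress_entries
    ]
```

Clients that need a screenshot request it on demand for a single entry instead of receiving every image on every poll.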
What's next for AgentTrust
- More providers: Expanding beyond GitHub, Google Calendar, and Slack to other tools teams use daily, all through Auth0's Connected Accounts and Token Vault.
- Multi-agent governance: Supporting multiple agents with different identity scopes operating in the same session, with cross-agent audit trails and policy enforcement.
- Reinforcement learning from audit data: Using the cryptographic audit trail as a training signal. Actions that were approved can reinforce good behavior; denied and step-up actions can train the agent to avoid risky patterns.
- Policy templates and marketplace: Pre-built policy configurations for common use cases (e-commerce browsing, financial research, social media management) so teams can deploy agents with appropriate guardrails out of the box.
- Enterprise SSO and RBAC: Integrating Auth0 Organizations for multi-tenant deployments where different teams have different agent permissions and approval workflows.
- Vision V2: Reassessing our approach to vision; letting the agent see a page the way a user does could yield greater accuracy when navigating the web.
Reason for no "Try it out" Link:
AgentTrust is a developer tool that currently runs locally in a controlled environment and has not been published as a public web app, extension listing, or mobile app. Because it depends on local setup, project-specific configuration, and user credentials, we cannot provide a public "try it out" link or APK at this time.
Blog
"The Missing Layer in AI Agents: Why Trust Will Be the Next Billion-Dollar Infrastructure"
AI innovation has accelerated rapidly, moving from simple assistants to autonomous agents capable of executing real-world actions. While this unlocks massive productivity, it also introduces a critical challenge: securely managing what these agents can access and do.
Modern AI agents often rely on sensitive credentials (i.e. API keys, access tokens, and connections to external services) to perform crucial actions. Without proper safeguards, these credentials can become a major vulnerability, leaving systems exposed to unauthorized actions or data breaches.
AgentTrust addresses this by integrating with Auth0 Token Vault to protect and manage credentials at the core of agent operations. Instead of exposing credentials directly to agents, Token Vault securely stores them and enables controlled, identity-bound access. This allows AgentTrust to enforce least-privilege permissions, ensuring agents can only act within the exact scope they are authorized for.
These agent systems also lack the rigor around authentication, authorization, and identity that has been critical to the success of traditional and enterprise-grade software. That gap leaves holes in applications and creates new attack surfaces: prompt injections, unauthorized monetary transactions, entire databases wiped. The possibilities only grow as agent control widens.
At its core, this is an identity and access problem. Platforms like Auth0 have standardized how developers secure user access. The next generation of AI systems requires a similar layer of trust, control, and verification.
Beyond secure token handling, AgentTrust provides real-time monitoring, audit logs, and active user approval controls. Users gain full visibility into every agent action (i.e. the prompt, site, user, and executed task) and are prompted to approve or reject sensitive actions in real time. If an agent attempts something potentially dangerous, it will be immediately flagged, paused, or blocked before execution. Additionally, agents run within isolated execution environments. This helps in minimizing the risk of data leakage and preventing cross-session compromise.
As AI agents become more embedded in production applications, securing credentials and enforcing access control at this level will be essential to building safe and scalable AI-powered systems.
Built With
- amazon-cloudwatch
- amazon-rds-relational-database-service
- auth0
- chatgpt
- docker
- langchain
- node.js
- python