NextAgent - Your business AI agent

The bridge between your existing business tools and AI — without rebuilding anything.

The Problem

AI is the most transformative technology in a generation. But for most businesses, it can't touch the tools where actual work happens.

No AI for admin panels

Only a tiny fraction of SaaS products — Slack, Shopify, Salesforce, Notion, Stripe — have official AI integrations. The rest? ERPs, admin panels, vendor portals, internal tools — completely disconnected from AI.

Where businesses actually spend their time

The AI industry focuses on tools that already have APIs. But that's not where the work is. The average mid-size company depends on 8-15 web-based tools daily, and the vast majority are closed systems with a browser UI as the only interface.

These aren't obscure tools. They're the backbone of daily operations:

  • ERP systems — SAP, Oracle, or custom-built. Runs the entire supply chain. No public API.
  • Internal admin panels — built by a contractor 5 years ago. No documentation. No one wants to touch the code.
  • Vendor/supplier portals — login, check order status, create purchase orders. Each vendor has a different portal.
  • Warehouse management (WMS) — inventory tracking, picking lists, shipping labels. Web interface only.
  • Legacy CRMs — the sales team has used it for 8 years and refuses to migrate. Works fine, zero AI.
  • Government compliance portals — tax filings, license renewals, regulatory reports. Form-based, manual.
  • Accounting software — invoicing, reconciliation, payments. Locked behind a login with no API.
  • Shipping platforms — tracking, label generation, dispatch. Different carrier, different portal.
  • Industry vertical SaaS — healthcare, logistics, manufacturing tools built for a niche with no AI roadmap.

All have web UIs. All make HTTP API calls under the hood. None are accessible to AI. This is where NextAgent operates.

The daily reality: copy-paste across 8 browser tabs

A typical operations team member performs the same cross-system workflows every day. Each step is manual — log in, navigate, copy a value, switch tabs, paste it, click submit, wait, repeat.

This isn't one workflow. It's dozens, every day:

| Workflow | Systems involved | Manual time | With NextAgent |
| --- | --- | --- | --- |
| Inventory check & reorder | WMS → ERP → Vendor portal → Storefront | 40 min/day | 30 seconds |
| Order reconciliation | Shopify → Warehouse → Accounting | 35 min/day | 30 seconds |
| Returns processing | CRM → Carrier portal → Inventory → Refund | 15-20 min per return | 1 minute |
| Vendor price updates | Email → Vendor portal → ERP → Storefront | 2-3 hours when it happens | 1 minute |
| Customer data sync | Website → CRM → Shipping → Accounting | 10 min per customer | 30 seconds |
| Compliance reporting | 3 systems → Government portal → File | 4+ hours monthly | 30 seconds |

The cost of doing nothing

These aren't edge cases — they're the core of daily operations. When you add them up across a team, the numbers are staggering.

| Metric | Value |
| --- | --- |
| Wasted time per employee | 520 hrs/year (2 hrs/day × 260 working days) |
| Copy-paste error rate | 3-5% (wrong SKU, wrong price, missed order) |
| Team of 5 total waste | 2,600 hrs/year, more than one full-time employee (about 2,080 hrs/year) doing nothing but data transfer |
| Opportunity cost | Staff doing manual data entry instead of customer service, strategy, growth |

The three paths to AI-powered operations and why two of them fail

Path 1: Custom API integration — reverse-engineer each tool's API, build a middleware layer, handle auth and edge cases, write tests, maintain it when the tool updates. Cost: $15-30K per tool, 4-8 weeks engineering. For 10 tools, that's $150-300K and 6-12 months before AI works. Too slow, too expensive.

Path 2: Screenshot-based agents (Anthropic Computer Use, OpenAI Operator, etc.) — the AI looks at screenshots and simulates mouse clicks. Sounds magical, but it's 8-15 seconds per action, costs $0.02-0.04 per action in vision tokens, and breaks when the UI changes, shows a CAPTCHA, or renders a loading spinner. Too slow, too brittle, too costly.

Path 3: NextAgent — record yourself using the tool for 5 minutes. NextAgent captures the HTTP calls the browser makes, reverse-engineers the API, and generates tool definitions the AI can call directly. Setup: 5 minutes per tool. Speed: 0.2-0.5 seconds per action. Cost: $0.003 per action. Fast, cheap, reliable.

The hidden insight: every web app already has an API

When you click "Submit Order" in your vendor portal, your browser doesn't send a click event to the server. It sends POST /api/v2/orders { items: [...], shipping: "express" }. That HTTP call IS the API — it's just not documented, not public, and not designed for external use. NextAgent captures those calls and makes them usable.
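
To make this concrete, here is a minimal sketch that models the captured call as plain data and reuses it with different arguments. The `CapturedRequest` shape and the `replayWith` helper are illustrative names, not NextAgent's actual types, and the `Content-Type` header is an assumption.

```typescript
// Illustrative shape for one captured browser request (not NextAgent's actual type).
interface CapturedRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
  body: Record<string, unknown>;
}

// The call from the example above. The line items are elided in the source,
// so the array is left empty here; the Content-Type header is an assumption.
const submitOrder: CapturedRequest = {
  method: "POST",
  path: "/api/v2/orders",
  headers: { "Content-Type": "application/json" },
  body: { items: [], shipping: "express" },
};

// Reuse the captured call as a template: same endpoint, different arguments.
function replayWith(
  captured: CapturedRequest,
  overrides: Record<string, unknown>,
): CapturedRequest {
  return { ...captured, body: { ...captured.body, ...overrides } };
}

const standardOrder = replayWith(submitOrder, { shipping: "standard" });
```

The point of the sketch is that once the call is data, changing an argument is a one-line override rather than another round of clicking through the UI.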

What is NextAgent?

NextAgent is an AI agent platform that connects to the business tools your company already uses and makes them automatable by AI — without requiring any API documentation, developer access, or modifications to the existing tools.

Core idea: you show the AI how you use a tool by recording yourself, and the AI learns how to use it too.

How it works

https://youtu.be/5o0ncOxrTp8

The three-phase pipeline: Capture → Learn → Use

Phase 1: Capture — "Show the AI what you do"

The user installs a Chrome extension and clicks "Record" before performing their workflow. While the user works normally, the extension captures three streams simultaneously:

Network capture — every HTTP request the web application makes is intercepted via Chrome DevTools Protocol. The extension records the URL, method, headers, request body, response status, and response body. A multi-layer filter removes noise.

DOM event tracking — a content script listens for user interactions: clicks on buttons and links, form submissions, text input, and navigation events. For each interaction, it captures a CSS selector path, the element's visible text, and the surrounding HTML context.
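
One way to build the CSS selector path is to walk up from the clicked element until an `id` is found. The sketch below assumes that strategy and uses a hypothetical `ElementLike` stand-in for a DOM element so it runs without a browser; the real content script may choose selectors differently.

```typescript
// Minimal stand-in for a DOM element (hypothetical, for illustration only).
interface ElementLike {
  tag: string;
  id?: string;
  classes: string[];
  parent?: ElementLike;
}

// Walk up from the clicked element and stop at the first ancestor with an id,
// since an id is usually a stable anchor for the rest of the path.
function selectorPath(el: ElementLike): string {
  const parts: string[] = [];
  let node: ElementLike | undefined = el;
  while (node !== undefined) {
    if (node.id !== undefined && node.id !== "") {
      parts.unshift(`#${node.id}`);
      break;
    }
    const classSuffix = node.classes.map((c) => `.${c}`).join("");
    parts.unshift(`${node.tag}${classSuffix}`);
    node = node.parent;
  }
  return parts.join(" > ");
}

const orderForm: ElementLike = { tag: "form", id: "order-form", classes: [] };
const actionsDiv: ElementLike = { tag: "div", classes: ["actions"], parent: orderForm };
const submitButton: ElementLike = {
  tag: "button",
  classes: ["btn", "primary"],
  parent: actionsDiv,
};

const clickedSelector = selectorPath(submitButton);
// clickedSelector is "#order-form > div.actions > button.btn.primary"
```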

Screenshot capture — on each meaningful user action, the extension takes a JPEG screenshot (~30-50KB). This provides visual context for understanding complex UIs.

These three streams are correlated by timestamp. A click at t=1200ms is linked to the API call that fires at t=1350ms.
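
A minimal sketch of that correlation step, assuming each network call is attached to the most recent user event that preceded it within a fixed time window. The type names and the 1000 ms default window are assumptions, not values from NextAgent.

```typescript
interface DomEvent {
  t: number; // ms since recording started
  kind: "click" | "submit" | "input" | "navigation";
  selector: string;
}
interface NetworkCall {
  t: number; // ms since recording started
  method: string;
  url: string;
}
interface CorrelatedAction {
  event: DomEvent;
  calls: NetworkCall[];
}

// Attach each call to the latest event at or before it, if it fired within windowMs.
// Calls with no event inside the window (for example background polling) stay unattached.
function correlate(
  events: DomEvent[],
  calls: NetworkCall[],
  windowMs = 1000,
): CorrelatedAction[] {
  const actions: CorrelatedAction[] = [...events]
    .sort((a, b) => a.t - b.t)
    .map((event) => ({ event, calls: [] }));
  for (const call of calls) {
    let owner: CorrelatedAction | undefined = undefined;
    for (const action of actions) {
      if (action.event.t > call.t) break;
      owner = action;
    }
    if (owner !== undefined && call.t - owner.event.t <= windowMs) {
      owner.calls.push(call);
    }
  }
  return actions;
}

const linked = correlate(
  [
    { t: 1200, kind: "click", selector: "button.submit-order" },
    { t: 5000, kind: "click", selector: "a.next-page" },
  ],
  [
    { t: 1350, method: "POST", url: "/api/v2/orders" },
    { t: 5100, method: "GET", url: "/api/orders?page=2" },
    { t: 9000, method: "GET", url: "/api/heartbeat" },
  ],
);
```

Here the click at t=1200ms picks up the POST at t=1350ms, and the heartbeat at t=9000ms is left unattached because no user action happened in the preceding second.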

The noise problem — and how we solve it

A single page load fires 200+ HTTP requests. Most are CSS, images, tracking pixels, and CDN fetches. NextAgent's multi-stage filter keeps only the real API calls.

Network capture filtering pipeline

| Filter stage | What it removes | Cumulative reduction |
| --- | --- | --- |
| Content-type gate | Images, CSS, fonts, HTML pages | ~60% removed |
| Origin + URL pattern | Analytics, ads, CDN, third-party scripts | ~80% removed |
| Dedup + throttle | Polling, heartbeats, duplicate requests | ~90% removed |

Result: 15-25 clean API calls per session
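
The three stages can be sketched as small predicates applied in order. This is an illustration under assumptions: the content-type allowlist, the noise patterns, the 5-second throttle window, and the `portal.example.com` origin are all hypothetical, not NextAgent's real rules.

```typescript
interface CapturedCall {
  method: string;
  url: string; // absolute URL
  contentType: string; // response Content-Type header
  t: number; // ms since recording started
}

// Stage 1: content-type gate. Keep only responses that look like API payloads.
const API_CONTENT_TYPES = ["application/json", "application/xml"];
function isApiContentType(call: CapturedCall): boolean {
  return API_CONTENT_TYPES.some((type) => call.contentType.startsWith(type));
}

// Stage 2: origin + URL pattern. Keep first-party calls that match no noise pattern.
const NOISE_PATTERNS = [/\/analytics\//, /\/tracking\//, /\/ads\//];
function isFirstPartyApi(call: CapturedCall, appOrigin: string): boolean {
  const sameOrigin = call.url.startsWith(appOrigin + "/");
  return sameOrigin && !NOISE_PATTERNS.some((pattern) => pattern.test(call.url));
}

// Stage 3: dedup + throttle. Drop a call if the same method + URL occurred
// within throttleMs of its previous occurrence.
function dedupe(calls: CapturedCall[], throttleMs: number): CapturedCall[] {
  const lastSeen = new Map<string, number>();
  const kept: CapturedCall[] = [];
  for (const call of calls) {
    const key = `${call.method} ${call.url}`;
    const previous = lastSeen.get(key);
    if (previous === undefined || call.t - previous > throttleMs) kept.push(call);
    lastSeen.set(key, call.t);
  }
  return kept;
}

function filterCalls(
  calls: CapturedCall[],
  appOrigin: string,
  throttleMs = 5000,
): CapturedCall[] {
  const candidates = calls.filter(
    (call) => isApiContentType(call) && isFirstPartyApi(call, appOrigin),
  );
  return dedupe(candidates, throttleMs);
}

const origin = "https://portal.example.com";
const recorded: CapturedCall[] = [
  { method: "GET", url: `${origin}/static/app.css`, contentType: "text/css", t: 10 },
  { method: "GET", url: "https://stats.example.net/collect", contentType: "application/json", t: 20 },
  { method: "GET", url: `${origin}/api/orders`, contentType: "application/json; charset=utf-8", t: 300 },
  { method: "GET", url: `${origin}/api/heartbeat`, contentType: "application/json", t: 1000 },
  { method: "GET", url: `${origin}/api/heartbeat`, contentType: "application/json", t: 4000 },
  { method: "POST", url: `${origin}/api/orders`, contentType: "application/json", t: 4500 },
];
const cleanCalls = filterCalls(recorded, origin);
```

In this sample the stylesheet fails stage 1, the third-party collector fails stage 2, and the second heartbeat is dropped by stage 3, leaving three calls. Note that this simple throttle still keeps the first heartbeat; removing polling endpoints entirely would need an additional rule.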

Phase 2: Learn — "The AI figures out the API"

The recorded session is sent to Claude for analysis. The LLM receives the structured action data and performs:

  • Endpoint normalization: /api/users/482 and /api/users/1057 → GET /api/users/{id}
  • Schema inference — infers parameter types, required fields, enums, and defaults from samples
  • Auth detection — identifies Bearer tokens, API keys, cookies, CSRF tokens from request headers
  • Intent mapping — "user clicked Submit Order" → "Place a new order with line items"
  • Workflow discovery — sequential actions grouped into multi-step workflows with data flow
  • Merge on re-recording — new discoveries merged into existing profiles, not duplicated
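
The endpoint normalization step above can be approximated deterministically. In NextAgent this analysis is performed by the LLM, so the sketch below is only a stand-in: it assumes a simple heuristic in which numeric or UUID-like path segments are treated as identifiers.

```typescript
// Heuristic (an assumption, not NextAgent's method): a path segment that is
// all digits or looks like a UUID is treated as a variable identifier.
const NUMERIC_SEGMENT = /^\d+$/;
const UUID_SEGMENT = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function normalizePath(path: string): string {
  return path
    .split("/")
    .map((segment) =>
      NUMERIC_SEGMENT.test(segment) || UUID_SEGMENT.test(segment) ? "{id}" : segment,
    )
    .join("/");
}

// Group observed calls by method + normalized path, keeping the raw paths as samples.
function groupEndpoints(
  calls: { method: string; path: string }[],
): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const call of calls) {
    const key = `${call.method} ${normalizePath(call.path)}`;
    const samples = groups.get(key) ?? [];
    samples.push(call.path);
    groups.set(key, samples);
  }
  return groups;
}

const endpoints = groupEndpoints([
  { method: "GET", path: "/api/users/482" },
  { method: "GET", path: "/api/users/1057" },
  { method: "POST", path: "/api/v2/orders" },
]);
// Keys: "GET /api/users/{id}" (two samples) and "POST /api/v2/orders" (one sample)
```

A version segment such as `v2` is left alone because it is not purely numeric, which is why `/api/v2/orders` keeps its original form.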

The output is a site profile — a JSON document with tool definitions in JSON Schema format, ready for MCP.
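
As an illustration of what one tool definition in a site profile might contain, the sketch below infers a small JSON Schema from sample request bodies and wraps it in the `name` / `description` / `inputSchema` shape that MCP uses for tools. The inference rule (a field is required only if every sample includes it) and the description text are assumptions; the real profile format is not shown in this writeup.

```typescript
type JsonType = "string" | "number" | "boolean" | "object" | "array" | "null";

interface PropertySchema {
  type: JsonType;
}
interface InputSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required: string[];
}
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: InputSchema;
}

function jsonTypeOf(value: unknown): JsonType {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  const t = typeof value;
  if (t === "string" || t === "number" || t === "boolean") return t;
  return "object";
}

// Assumed rule: a field's type comes from the first sample that contains it,
// and a field is required only if it appears in every sample.
function inferSchema(samples: Record<string, unknown>[]): InputSchema {
  const properties: Record<string, PropertySchema> = {};
  const counts: Record<string, number> = {};
  for (const sample of samples) {
    for (const key of Object.keys(sample)) {
      if (properties[key] === undefined) {
        properties[key] = { type: jsonTypeOf(sample[key]) };
      }
      counts[key] = (counts[key] ?? 0) + 1;
    }
  }
  const required = Object.keys(counts)
    .filter((key) => counts[key] === samples.length)
    .sort();
  return { type: "object", properties, required };
}

const searchProducts: ToolDefinition = {
  name: "search_products",
  description: "Search the product catalogue", // hypothetical wording
  inputSchema: inferSchema([
    { query: "wireless keyboard", in_stock: true },
    { query: "usb hub" },
  ]),
};
```

With these two samples, `query` is inferred as a required string and `in_stock` as an optional boolean.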

Phase 3: Use — "The AI can now automate it"

The site profile's tools are served as a local MCP (Model Context Protocol) server. When a user chats, the AI sees the discovered tools alongside built-in tools and can call them directly.

When the AI calls search_products({ query: "wireless keyboard", in_stock: true }), the local MCP server resolves the endpoint, applies auth, makes the HTTP call, and returns the result — all in 200-500ms.
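
The resolution step can be sketched as a pure function that turns a tool call into an HTTP request description. Everything here is illustrative: the `Endpoint` and `AuthConfig` shapes, the `/api/products/search` path, and the base URL are assumptions, and the actual network call is left out so the example stays self-contained.

```typescript
interface Endpoint {
  method: string;
  pathTemplate: string; // may contain {placeholders}
}
interface AuthConfig {
  header: string;
  value: string;
}
interface HttpRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: string;
}
type ToolArgs = Record<string, string | number | boolean>;

function resolveToolCall(
  baseUrl: string,
  endpoint: Endpoint,
  auth: AuthConfig,
  args: ToolArgs,
): HttpRequest {
  const remaining: ToolArgs = { ...args };
  // Fill {placeholders} in the path from the arguments.
  const path = endpoint.pathTemplate.replace(/\{(\w+)\}/g, (_match, name: string) => {
    const value = remaining[name];
    if (value === undefined) throw new Error(`Missing path argument: ${name}`);
    delete remaining[name];
    return encodeURIComponent(String(value));
  });
  // Apply the auth header detected during analysis.
  const headers: Record<string, string> = { [auth.header]: auth.value };
  if (endpoint.method === "GET") {
    // Remaining arguments become the query string (sorted for stable output).
    const query = Object.keys(remaining)
      .sort()
      .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(remaining[key]))}`)
      .join("&");
    const url = baseUrl + path + (query === "" ? "" : `?${query}`);
    return { method: "GET", url, headers };
  }
  // For other methods, remaining arguments are sent as a JSON body.
  headers["Content-Type"] = "application/json";
  return { method: endpoint.method, url: baseUrl + path, headers, body: JSON.stringify(remaining) };
}

const searchRequest = resolveToolCall(
  "https://portal.example.com",
  { method: "GET", pathTemplate: "/api/products/search" },
  { header: "Authorization", value: "Bearer <token>" },
  { query: "wireless keyboard", in_stock: true },
);
// searchRequest.url is
// "https://portal.example.com/api/products/search?in_stock=true&query=wireless%20keyboard"
```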

Why API-level beats screenshot-level

Screenshot-based browser agents take a different approach: they look at screenshots and simulate mouse clicks. For execution, NextAgent's API-level approach is faster, cheaper, and far less sensitive to UI changes.

Cost: Screenshot agent vs NextAgent — 10x cheaper

Screenshot analysis requires sending high-resolution images to a vision model for every step. API-level automation sends only structured text. The gap compounds at scale.
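
A quick worked comparison using the per-action figures quoted earlier ($0.02-0.04 for screenshot agents, $0.003 for NextAgent). The volume of 10,000 actions is a hypothetical number chosen only to show how the gap scales.

```typescript
// Per-action costs in USD, taken from the figures quoted earlier in this writeup.
const SCREENSHOT_COST_LOW = 0.02;
const SCREENSHOT_COST_HIGH = 0.04;
const API_COST = 0.003;

// Total cost for a number of actions, rounded to cents.
function costFor(actionCount: number, costPerAction: number): number {
  return Math.round(actionCount * costPerAction * 100) / 100;
}

const actionCount = 10000; // hypothetical volume
const screenshotLow = costFor(actionCount, SCREENSHOT_COST_LOW); // 200
const screenshotHigh = costFor(actionCount, SCREENSHOT_COST_HIGH); // 400
const apiTotal = costFor(actionCount, API_COST); // 30

// Ratio of screenshot cost to API cost at each end of the quoted range.
const ratioLow = SCREENSHOT_COST_LOW / API_COST; // about 6.7
const ratioHigh = SCREENSHOT_COST_HIGH / API_COST; // about 13.3
```

On these figures the ratio falls between roughly 7x and 13x, which is where the "10x cheaper" headline comes from.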

Full comparison

| Failure scenario | Screenshot agent | NextAgent (API) |
| --- | --- | --- |
| Website redesign | Breaks (buttons moved) | Unaffected (API unchanged) |
| A/B test variant | May see different layout | Unaffected |
| Loading spinner | Must wait and retry | Instant API response |
| Pop-up / cookie banner | Gets confused | Unaffected |
| Anti-bot / CAPTCHA | Blocked entirely | Normal API traffic |
| Scale to 100 parallel | 100 browser instances needed | 100 HTTP calls (trivial) |

The design principle: Use vision to learn, use APIs to execute. NextAgent uses screenshots during exploration and recording (to understand the UI), but all execution happens at the API level — fast, cheap, and reliable.

Autonomous exploration

Beyond recording, NextAgent supports autonomous browser exploration. The user asks: "Find all product categories on nike.com" — and the AI agent:

  1. Opens a new browser tab to the URL
  2. Takes a screenshot and analyzes the page visually
  3. Extracts the navigation menu structure
  4. Hovers over each menu item to reveal dropdowns
  5. Screenshots each expanded state to read contents
  6. Compiles a structured list of everything found
  7. Closes the tab and returns results

The AI sees the page as a human would, reasons about what to explore next, and systematically extracts information — useful for competitive research, catalogue mapping, site auditing, and reconnaissance before recording.
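
The exploration steps listed above follow a simple open, read, expand, compile, close loop. The sketch below shows only that control flow against a hypothetical `BrowserSession` interface with a fake implementation. A real session would be asynchronous and driven by screenshots and a vision model; none of these method names come from NextAgent.

```typescript
// Hypothetical, synchronous stand-in for a browser session.
interface BrowserSession {
  open(url: string): void;
  readTopMenu(): string[]; // screenshot the page and extract the menu labels
  hoverAndRead(label: string): string[]; // hover a menu item and read its dropdown
  close(): void;
}

// Open the site, expand each top-level menu item, compile the results, then close.
function exploreCategories(
  session: BrowserSession,
  url: string,
): Record<string, string[]> {
  const found: Record<string, string[]> = {};
  session.open(url);
  try {
    for (const label of session.readTopMenu()) {
      found[label] = session.hoverAndRead(label);
    }
  } finally {
    session.close(); // always close the tab, even if a step fails
  }
  return found;
}

// Fake session backed by a static menu, so the loop can run without a browser.
function fakeSession(menu: Record<string, string[]>, log: string[]): BrowserSession {
  return {
    open: (url) => {
      log.push(`open ${url}`);
    },
    readTopMenu: () => Object.keys(menu),
    hoverAndRead: (label) => menu[label] ?? [],
    close: () => {
      log.push("close");
    },
  };
}

const sessionLog: string[] = [];
const categories = exploreCategories(
  fakeSession({ Men: ["Shoes", "Clothing"], Women: ["Shoes", "Accessories"] }, sessionLog),
  "https://shop.example.com",
);
```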

A concrete example

A mid-size e-commerce company uses 8 web-based tools daily: Shopify admin, warehouse WMS, shipping portal, accounting software, customer support tool, analytics dashboard, vendor portal, and returns management.

Before NextAgent: 5 operations staff spend 2 hours/day each on cross-system manual workflows.

After NextAgent: Same workflows completed via natural language in minutes.

Annual cost to automate 8 business tools

| Approach | Annual cost | Time to deploy | Reliability |
| --- | --- | --- | --- |
| Manual labor (status quo) | $78,000/year | N/A | Human error rate 3-5% |
| Custom API integrations | $35,000/year | 6-12 months | High, but costly to maintain |
| Screenshot agents | $14,000/year | 1-2 weeks | Brittle (breaks on UI changes) |
| NextAgent | $4,000/year | 1 day | API-level (immune to UI changes) |

ROI calculation

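| Factor | Value |
| --- | --- |
| Labor saved | 2 hrs/day × 5 staff × 260 days × $30/hr = $78,000/year |
| NextAgent cost | LLM tokens ~$200/month + infrastructure = $4,000/year |
| Net savings | $78,000 − $4,000 = $74,000/year |
| ROI | ~20x in the first year ($78,000 saved ÷ $4,000 spent = 19.5) |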
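
The same arithmetic as a small function, so the inputs can be swapped for another team's numbers. The field names are illustrative; the example values are the ones from the table.

```typescript
interface RoiInputs {
  hoursPerDay: number; // manual cross-system work per person per day
  staff: number;
  workingDays: number; // per year
  hourlyRate: number; // USD
  annualToolCost: number; // USD per year
}
interface RoiResult {
  laborSaved: number;
  netSavings: number;
  returnMultiple: number;
}

function computeRoi(inputs: RoiInputs): RoiResult {
  const laborSaved =
    inputs.hoursPerDay * inputs.staff * inputs.workingDays * inputs.hourlyRate;
  return {
    laborSaved,
    netSavings: laborSaved - inputs.annualToolCost,
    returnMultiple: laborSaved / inputs.annualToolCost,
  };
}

const exampleRoi = computeRoi({
  hoursPerDay: 2,
  staff: 5,
  workingDays: 260,
  hourlyRate: 30,
  annualToolCost: 4000,
});
// laborSaved: 78000, netSavings: 74000, returnMultiple: 19.5 (rounded to 20x in the table)
```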

Architecture

[Figure: NextAgent architecture diagram]

Technology

| Component | Technology | Purpose |
| --- | --- | --- |
| Backend | NestJS + TypeScript | API server, chat streaming, tool execution |
| Frontend | React + Vite + Tailwind | Chat UI, browser control, recorder management |
| Database | PostgreSQL 16 | Conversations, tools, site profiles, recordings |
| Cache | Redis 7 | Streaming checkpoints, model cache |
| Auth | Keycloak 24 | JWT-based authentication |
| LLM | Claude Sonnet 4 via OpenRouter | Chat, tool calling, recording analysis |
| Extension | Chrome Manifest V3 | Network capture, DOM tracking, browser automation |
| Protocol | Model Context Protocol (MCP) | Tool interoperability standard |

Roadmap

Delivered (current codebase)

  • Real-time streaming chat with tool execution
  • Remote browser control via Chrome extension (25+ CDP commands)
  • Agent task decomposition with multi-step planning
  • File operations, web search, code execution tools
  • External MCP server integration
  • Network interception and footprint tracking
  • PDF generation, chart rendering, interactive UI blocks
  • BYOK (Bring Your Own API Key) support
  • Long-term memory system

In development

  • API Recording Engine — enhanced capture with screenshot + network + DOM correlation
  • API Discovery Service — LLM-powered analysis of recordings into tool definitions
  • Local MCP Server — serve discovered tools through the existing MCP pipeline
  • Autonomous Browser Exploration — AI-driven site navigation with vision
  • Recording merge — multiple sessions enriching the same site profile

Future

  • Scheduled automation — recurring workflows on cron
  • Multi-user tool sharing — team-wide discovered tool libraries
  • Visual workflow builder — drag-and-drop composition of discovered tools
  • Webhook triggers — start automations from external events
  • OAuth flow handling — automatic token refresh for discovered APIs
  • Mobile app support — extend beyond Chrome
  • On-premise deployment — Docker-based self-hosted for security-sensitive industries

How to evaluate the impact

Before NextAgent

A staff member spends 2-3 hours daily on cross-system tasks: checking inventory in the WMS, creating purchase orders in the vendor portal, updating stock in Shopify, filing shipping requests, reconciling orders. Each task involves logging into a system, navigating to the right page, copying data, switching tabs, and pasting.

After NextAgent

The same staff member records each workflow once (30 minutes total). Now they say:

  • "Check which products are below reorder point and create purchase orders"
  • "Reconcile today's shipped orders between Shopify and the warehouse"
  • "Find all returns from last week and update inventory accordingly"

Each previously took 20-40 minutes. With NextAgent: 30-60 seconds.

Why NextAgent? Because the fastest path to AI-powered operations isn't rebuilding your tools. It's teaching AI to use the ones you already have.

Built With

  • ai
  • nextjs
  • openrouter