NextAgent - Your business AI agent

The bridge between your existing business tools and AI — without rebuilding anything.

The Problem

AI is the most transformative technology in a generation. But for most businesses, it can't touch the tools where actual work happens.

No AI for the admin panel

Only a tiny fraction of SaaS products — Slack, Shopify, Salesforce, Notion, Stripe — have official AI integrations. The rest? ERPs, admin panels, vendor portals, internal tools — completely disconnected from AI.

Where businesses actually spend their time

The AI industry focuses on tools that already have APIs. But that's not where the work is. The average mid-size company depends on 8-15 web-based tools daily, and the vast majority are closed systems with a browser UI as the only interface.

These aren't obscure tools. They're the backbone of daily operations:

ERP systems: SAP, Oracle, or custom-built. They run the entire supply chain, often with no API the operations team can actually use.
Internal admin panels: built by a contractor 5 years ago. No documentation. No one wants to touch the code.
Vendor/supplier portals: log in, check order status, create purchase orders. Each vendor has a different portal.
Warehouse management (WMS): inventory tracking, picking lists, shipping labels. Web interface only.
Legacy CRMs: the sales team has used it for 8 years and refuses to migrate. Works fine, zero AI.
Government compliance portals: tax filings, license renewals, regulatory reports. Form-based, manual.
Accounting software: invoicing, reconciliation, payments. Locked behind a login, often with no API.
Shipping platforms: tracking, label generation, dispatch. Different carrier, different portal.
Industry vertical SaaS: healthcare, logistics, manufacturing tools built for a niche with no AI roadmap.

All have web UIs. All make HTTP API calls under the hood. None are accessible to AI. This is where NextAgent operates.

The daily reality: copy-paste across 8 browser tabs

A typical operations team member performs the same cross-system workflows every day. Each step is manual — log in, navigate, copy a value, switch tabs, paste it, click submit, wait, repeat.

This isn't one workflow. It's dozens, every day:

Workflow	Systems involved	Manual time	With NextAgent
Inventory check & reorder	WMS → ERP → Vendor portal → Storefront	40 min/day	30 seconds
Order reconciliation	Shopify → Warehouse → Accounting	35 min/day	30 seconds
Returns processing	CRM → Carrier portal → Inventory → Refund	15-20 min per return	1 minute
Vendor price updates	Email → Vendor portal → ERP → Storefront	2-3 hours when it happens	1 minute
Customer data sync	Website → CRM → Shipping → Accounting	10 min per customer	30 seconds
Compliance reporting	3 systems → Government portal → File	4+ hours monthly	30 seconds

The cost of doing nothing

These aren't edge cases — they're the core of daily operations. When you add them up across a team, the numbers are staggering.

Metric	Value
Wasted time per employee	520 hrs/year (2 hrs/day × 260 working days)
Copy-paste error rate	3-5% — wrong SKU, wrong price, missed order
Team of 5 total waste	2,600 hrs/year, more than one full-time employee (about 2,080 hrs/year at 8 hrs/day) doing nothing but data transfer
Opportunity cost	Staff doing manual data entry instead of customer service, strategy, growth

The three paths to AI-powered operations and why two of them fail

Path 1: Custom API integration — reverse-engineer each tool's API, build a middleware layer, handle auth and edge cases, write tests, maintain it when the tool updates. Cost: $15-30K per tool, 4-8 weeks engineering. For 10 tools, that's $150-300K and 6-12 months before AI works. Too slow, too expensive.

Path 2: Screenshot-based agents (Anthropic Computer Use, OpenAI Operator, etc.) — the AI looks at screenshots and simulates mouse clicks. Sounds magical, but it's 8-15 seconds per action, costs $0.02-0.04 per action in vision tokens, and breaks when the UI changes, shows a CAPTCHA, or renders a loading spinner. Too slow, too brittle, too costly.

Path 3: NextAgent — record yourself using the tool for 5 minutes. NextAgent captures the HTTP calls the browser makes, reverse-engineers the API, and generates tool definitions the AI can call directly. Setup: 5 minutes per tool. Speed: 0.2-0.5 seconds per action. Cost: $0.003 per action. Fast, cheap, reliable.

The hidden insight: every web app already has an API

When you click "Submit Order" in your vendor portal, your browser doesn't send a click event to the server. It sends POST /api/v2/orders { items: [...], shipping: "express" }. That HTTP call IS the API — it's just not documented, not public, and not designed for external use. NextAgent captures those calls and makes them usable.
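As a minimal sketch of that idea, the snippet below treats the captured call as data that can be replayed with a different payload. The path and the `shipping` field come from the example above; the `CapturedCall` type, the `withBody` helper, the header, and the item values are hypothetical and only illustrate the shape.

```typescript
// Hypothetical shape for one captured browser call; not NextAgent's actual format.
interface CapturedCall {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  headers: Record<string, string>;
  body?: unknown;
}

// What the browser sent when the user clicked "Submit Order".
// The item values are made up for illustration.
const submitOrder: CapturedCall = {
  method: "POST",
  path: "/api/v2/orders",
  headers: { "content-type": "application/json" },
  body: { items: [{ sku: "KB-100", qty: 2 }], shipping: "express" },
};

// Reuse the captured method, path, and headers with a new payload.
function withBody(call: CapturedCall, body: unknown): CapturedCall {
  return { ...call, body };
}

const reorder = withBody(submitOrder, {
  items: [{ sku: "MS-200", qty: 1 }],
  shipping: "standard",
});
```

The point is that nothing about the click itself is needed to repeat the action: the method, path, headers, and body are enough.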

What is NextAgent?

NextAgent is an AI agent platform that connects to the business tools your company already uses and makes them automatable by AI — without requiring any API documentation, developer access, or modifications to the existing tools.

Core idea: you show the AI how you use a tool by recording yourself, and the AI learns how to use it too.

How it works

https://youtu.be/5o0ncOxrTp8

The three-phase pipeline: Capture → Learn → Use

Phase 1: Capture — "Show the AI what you do"

The user installs a Chrome extension and clicks "Record" before performing their workflow. While the user works normally, the extension captures three streams simultaneously:

Network capture — every HTTP request the web application makes is intercepted via Chrome DevTools Protocol. The extension records the URL, method, headers, request body, response status, and response body. A multi-layer filter removes noise.

DOM event tracking — a content script listens for user interactions: clicks on buttons and links, form submissions, text input, and navigation events. For each interaction, it captures a CSS selector path, the element's visible text, and the surrounding HTML context.

Screenshot capture — on each meaningful user action, the extension takes a JPEG screenshot (~30-50KB). This provides visual context for understanding complex UIs.

These three streams are correlated by timestamp. A click at t=1200ms is linked to the API call that fires at t=1350ms.
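The timestamp correlation can be sketched as follows. This is an illustration under stated assumptions, not the extension's actual code: the `DomEvent` and `NetworkCall` shapes, the `correlate` function, and the 1000 ms matching window are all hypothetical.

```typescript
// Hypothetical event shapes; t is milliseconds since the recording started.
interface DomEvent { t: number; kind: "click" | "submit" | "input" | "navigate"; text: string; }
interface NetworkCall { t: number; method: string; url: string; }
interface CorrelatedAction { event: DomEvent; calls: NetworkCall[]; }

// Attach each network call to the latest DOM event that happened at or before it,
// but only if the call fired within windowMs of that event (assumed default: 1000 ms).
function correlate(events: DomEvent[], calls: NetworkCall[], windowMs = 1000): CorrelatedAction[] {
  const actions: CorrelatedAction[] = events
    .slice()
    .sort((a, b) => a.t - b.t)
    .map((domEvent) => ({ event: domEvent, calls: [] }));

  for (const call of calls.slice().sort((a, b) => a.t - b.t)) {
    let target: CorrelatedAction | undefined;
    for (const action of actions) {
      if (action.event.t > call.t) break; // events are sorted, so stop at the first later one
      target = action;
    }
    if (target !== undefined && call.t - target.event.t <= windowMs) {
      target.calls.push(call);
    }
  }
  return actions;
}

const linked = correlate(
  [{ t: 1200, kind: "click", text: "Submit Order" }],
  [
    { t: 1350, method: "POST", url: "/api/v2/orders" }, // 150 ms after the click: linked
    { t: 9000, method: "GET", url: "/api/ping" },       // far outside the window: ignored
  ],
);
```

A fixed window is the simplest possible rule. Calls that fire with no user action nearby (polling, for example) stay unlinked, which is also a useful noise signal.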

The noise problem — and how we solve it

A single page load fires 200+ HTTP requests. Most are CSS, images, tracking pixels, and CDN fetches. NextAgent's multi-stage filter keeps only the real API calls.

Network capture filtering pipeline

Filter stage	What it removes	Cumulative reduction
Content-type gate	Images, CSS, fonts, HTML pages	~60% removed
Origin + URL pattern	Analytics, ads, CDN, third-party scripts	~80% removed
Dedup + throttle	Polling, heartbeats, duplicate requests	~90% removed
Result	Only real API calls remain	15-25 clean API calls per session
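The three stages in the table can be sketched as a chain of plain filters. The record shape, the content-type list, and the URL patterns below are assumptions chosen for illustration; the document does not list the real rules, and the throttle part of the last stage is omitted.

```typescript
// Hypothetical record shape; the real capture format is not specified here.
interface Captured { method: string; url: string; contentType: string; }

// Assumed rule sets, for illustration only.
const API_CONTENT_TYPES = ["application/json", "application/xml", "text/xml"];
const NOISE_URL_PATTERNS = [/\/static\//, /\/assets\//, /\/analytics\//];

// Stage 1: keep only responses whose content type looks like API data.
function isApiContentType(r: Captured): boolean {
  const type = r.contentType.toLowerCase();
  return API_CONTENT_TYPES.some((t) => type.startsWith(t));
}

// Stage 2: keep calls to the application's own origin that match no noise pattern.
function isFirstPartyApi(r: Captured, appOrigin: string): boolean {
  return r.url.startsWith(appOrigin + "/") && !NOISE_URL_PATTERNS.some((p) => p.test(r.url));
}

// Stage 3: collapse repeats of the same method + URL (polling, heartbeats, duplicates).
function dedupe(records: Captured[]): Captured[] {
  const seen = new Set<string>();
  return records.filter((r) => {
    const key = `${r.method} ${r.url}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

function filterCapture(records: Captured[], appOrigin: string): Captured[] {
  return dedupe(records.filter(isApiContentType).filter((r) => isFirstPartyApi(r, appOrigin)));
}

const kept = filterCapture(
  [
    { method: "GET", url: "https://portal.example.com/logo.png", contentType: "image/png" },
    { method: "POST", url: "https://www.google-analytics.com/collect", contentType: "application/json" },
    { method: "GET", url: "https://portal.example.com/api/orders", contentType: "application/json; charset=utf-8" },
    { method: "GET", url: "https://portal.example.com/api/orders", contentType: "application/json" },
  ],
  "https://portal.example.com",
);
```

Note that deduplicating on method and URL alone would also collapse two POSTs with different bodies. A real filter would likely need to include the body in the key for write operations.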

Phase 2: Learn — "The AI figures out the API"

The recorded session is sent to Claude for analysis. The LLM receives the structured action data and performs:

Endpoint normalization — /api/users/482 and /api/users/1057 → GET /api/users/{id}
Schema inference — infers parameter types, required fields, enums, and defaults from samples
Auth detection — identifies Bearer tokens, API keys, cookies, CSRF tokens from request headers
Intent mapping — "user clicked Submit Order" → "Place a new order with line items"
Workflow discovery — sequential actions grouped into multi-step workflows with data flow
Merge on re-recording — new discoveries merged into existing profiles, not duplicated
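To show the intended result of endpoint normalization, here is a small deterministic stand-in. The document attributes this step to the LLM, so the regex heuristics below (numeric IDs, UUIDs, long hex strings) are a swapped-in simplification, and `normalizePath` and `groupEndpoints` are hypothetical names.

```typescript
// Path segments matching any of these are treated as identifiers (assumed heuristics).
const ID_PATTERNS = [
  /^\d+$/,                                                          // 482, 1057
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i, // UUID
  /^[0-9a-f]{16,}$/i,                                               // long hex token
];

// Replace identifier-like segments with {id}; the query string is ignored.
function normalizePath(path: string): string {
  const pathname = path.split("?")[0] ?? path;
  return pathname
    .split("/")
    .map((segment) => (ID_PATTERNS.some((p) => p.test(segment)) ? "{id}" : segment))
    .join("/");
}

// Count how many recorded calls collapse onto each "METHOD /template" key.
function groupEndpoints(calls: { method: string; path: string }[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const call of calls) {
    const key = `${call.method} ${normalizePath(call.path)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

const endpoints = groupEndpoints([
  { method: "GET", path: "/api/users/482" },
  { method: "GET", path: "/api/users/1057" },
  { method: "GET", path: "/api/users?page=2" },
]);
// endpoints now has two keys: "GET /api/users/{id}" (2 calls) and "GET /api/users" (1 call)
```

Version segments such as `v2` in `/api/v2/orders` are left alone because they are not purely numeric. Slugs and other non-numeric identifiers are where a fixed heuristic falls short and model-based analysis would have to do better.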

The output is a site profile — a JSON document with tool definitions in JSON Schema format, ready for MCP.
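A tool definition in such a profile might look like the sketch below, with a deliberately simple schema inference over recorded samples. The `name`, `description`, and `inputSchema` fields follow the MCP tool format; the inference rules, the `inferInputSchema` helper, the description text, and the second sample are assumptions for illustration.

```typescript
type JsonType = "string" | "number" | "boolean" | "null" | "array" | "object";

interface ToolInputSchema {
  type: "object";
  properties: Record<string, { type: JsonType }>;
  required: string[];
}

function typeOf(value: unknown): JsonType {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  const t = typeof value;
  return t === "string" || t === "number" || t === "boolean" ? t : "object";
}

// Assumed rules: a property's type is taken from the first sample that contains it,
// and a property is required only if every sample contains it.
function inferInputSchema(samples: Record<string, unknown>[]): ToolInputSchema {
  const properties: Record<string, { type: JsonType }> = {};
  for (const sample of samples) {
    for (const key of Object.keys(sample)) {
      if (!(key in properties)) properties[key] = { type: typeOf(sample[key]) };
    }
  }
  const required = Object.keys(properties)
    .filter((key) => samples.every((sample) => key in sample))
    .sort();
  return { type: "object", properties, required };
}

// Two hypothetical recorded calls to the same endpoint.
const searchProductsTool = {
  name: "search_products",
  description: "Search the product catalogue",
  inputSchema: inferInputSchema([
    { query: "wireless keyboard", in_stock: true },
    { query: "usb hub" },
  ]),
};
```

Here `query` ends up required because both samples include it, while `in_stock` is optional. More recordings give the inference more evidence, which is one reason merging re-recordings into the same profile matters.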

Phase 3: Use — "The AI can now automate it"

The site profile's tools are served as a local MCP (Model Context Protocol) server. When a user chats, the AI sees the discovered tools alongside built-in tools and can call them directly.

When the AI calls search_products({ query: "wireless keyboard", in_stock: true }), the local MCP server resolves the endpoint, applies auth, makes the HTTP call, and returns the result — all in 200-500ms.
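The resolve step can be sketched as a pure function that turns a tool call into an HTTP request description. This does not use any MCP SDK; `ToolBinding`, `resolveToolCall`, the base URL, the path, and the token placeholder are hypothetical.

```typescript
// Hypothetical slice of a site profile: how one tool maps onto an HTTP endpoint.
interface ToolBinding { method: "GET" | "POST"; baseUrl: string; pathTemplate: string; }
interface HttpRequest { method: string; url: string; headers: Record<string, string>; body?: string; }
type ToolArgs = Record<string, string | number | boolean>;

function resolveToolCall(
  binding: ToolBinding,
  args: ToolArgs,
  authHeaders: Record<string, string>,
): HttpRequest {
  const remaining: ToolArgs = { ...args };

  // Fill {placeholders} in the path from args; whatever is left becomes query or body.
  const path = binding.pathTemplate.replace(/\{(\w+)\}/g, (_match, key: string) => {
    const value = remaining[key];
    if (value === undefined) throw new Error(`Missing path parameter: ${key}`);
    delete remaining[key];
    return encodeURIComponent(String(value));
  });

  const headers: Record<string, string> = { accept: "application/json", ...authHeaders };

  if (binding.method === "GET") {
    const query = Object.keys(remaining)
      .map((k) => `${encodeURIComponent(k)}=${encodeURIComponent(String(remaining[k]))}`)
      .join("&");
    return { method: "GET", url: binding.baseUrl + path + (query ? `?${query}` : ""), headers };
  }

  return {
    method: binding.method,
    url: binding.baseUrl + path,
    headers: { ...headers, "content-type": "application/json" },
    body: JSON.stringify(remaining),
  };
}

const resolved = resolveToolCall(
  { method: "GET", baseUrl: "https://shop.example.com", pathTemplate: "/api/products/search" },
  { query: "wireless keyboard", in_stock: true },
  { authorization: "Bearer <token captured during recording>" },
);
```

Keeping resolution separate from sending makes the mapping easy to test. The resulting object can then be passed to whatever HTTP client the server uses, for example `fetch(resolved.url, resolved)`.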

Why API-level beats screenshot-level

Browser automation agents take a different approach: they look at screenshots and simulate mouse clicks. For repeatable workflows, NextAgent's API-level approach is faster, cheaper, and far less sensitive to UI changes.

Cost: Screenshot agent vs NextAgent — 10x cheaper

Screenshot analysis requires sending high-resolution images to a vision model for every step. API-level automation sends only structured text. The gap compounds at scale.
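To see how the gap compounds, the per-action figures quoted earlier in this document can be scaled to a monthly volume. The 1,000-action volume and the names below are assumptions for illustration.

```typescript
type Bounds = [number, number]; // [low, high]

interface AgentProfile { secondsPerAction: Bounds; usdPerAction: Bounds; }

// Per-action figures as quoted in the three-paths comparison.
const screenshotAgent: AgentProfile = { secondsPerAction: [8, 15], usdPerAction: [0.02, 0.04] };
const apiAgent: AgentProfile = { secondsPerAction: [0.2, 0.5], usdPerAction: [0.003, 0.003] };

function estimate(profile: AgentProfile, actionCount: number): { seconds: Bounds; usd: Bounds } {
  const scale = (b: Bounds): Bounds => [b[0] * actionCount, b[1] * actionCount];
  return { seconds: scale(profile.secondsPerAction), usd: scale(profile.usdPerAction) };
}

// Assumed volume: 1,000 actions per month.
const screenshotMonthly = estimate(screenshotAgent, 1000);
const apiMonthly = estimate(apiAgent, 1000);

// Cost ratio per action, low and high ends of the screenshot range against the API cost.
const costRatio: Bounds = [
  screenshotAgent.usdPerAction[0] / apiAgent.usdPerAction[0],
  screenshotAgent.usdPerAction[1] / apiAgent.usdPerAction[1],
];
```

At that volume the screenshot approach comes to roughly $20-40 and 2.2-4.2 hours of agent time, against about $3 and 3-8 minutes at the API level. The cost ratio lands between about 6.7x and 13.3x, which is where the "10x" headline sits.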

Full comparison

Failure scenario	Screenshot agent	NextAgent (API)
Website redesign	Breaks — buttons moved	Unaffected — API unchanged
A/B test variant	May see different layout	Unaffected
Loading spinner	Must wait and retry	Instant API response
Pop-up / cookie banner	Gets confused	Unaffected
Anti-bot / CAPTCHA	Blocked entirely	Normal API traffic
Scale to 100 parallel	100 browser instances needed	100 HTTP calls (trivial)

The design principle: Use vision to learn, use APIs to execute. NextAgent uses screenshots during exploration and recording (to understand the UI), but all execution happens at the API level — fast, cheap, and reliable.

Autonomous exploration

Beyond recording, NextAgent supports autonomous browser exploration. The user asks: "Find all product categories on nike.com" — and the AI agent:

Opens a new browser tab to the URL
Takes a screenshot and analyzes the page visually
Extracts the navigation menu structure
Hovers over each menu item to reveal dropdowns
Screenshots each expanded state to read contents
Compiles a structured list of everything found
Closes the tab and returns results

The AI sees the page as a human would, reasons about what to explore next, and systematically extracts information — useful for competitive research, catalogue mapping, site auditing, and reconnaissance before recording.

A concrete example

A mid-size e-commerce company uses 8 web-based tools daily: Shopify admin, warehouse WMS, shipping portal, accounting software, customer support tool, analytics dashboard, vendor portal, and returns management.

Before NextAgent: 5 operations staff spend 2 hours/day each on cross-system manual workflows.

After NextAgent: Same workflows completed via natural language in minutes.

Annual cost to automate 8 business tools

Approach	Annual cost	Time to deploy	Reliability
Manual labor (status quo)	$78,000/year	N/A	Human error rate 3-5%
Custom API integrations	$35,000/year	6-12 months	High, but costly to maintain
Screenshot agents	$14,000/year	1-2 weeks	Brittle — breaks on UI changes
NextAgent	$4,000/year	1 day	API-level — immune to UI changes

ROI calculation

Factor	Value
Labor saved	2 hrs/day × 5 staff × 260 days × $30/hr = $78,000/year
NextAgent cost	LLM tokens ~$200/month ($2,400/year) + infrastructure = about $4,000/year
Net savings	$74,000/year
ROI	About 19x in the first year ($78,000 saved / $4,000 spent = 19.5x; 18.5x on net savings)
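The arithmetic behind this table, written out so the inputs can be changed. All figures come from the table itself. The infrastructure line is not stated directly, so it is derived here as the remainder of the $4,000 total.

```typescript
// Inputs from the ROI table.
const hoursPerDay = 2;
const staffCount = 5;
const workingDays = 260;
const hourlyRateUsd = 30;
const llmTokensPerMonthUsd = 200;
const nextAgentAnnualCostUsd = 4000;

const laborSavedUsd = hoursPerDay * staffCount * workingDays * hourlyRateUsd; // 78000
const llmTokensAnnualUsd = llmTokensPerMonthUsd * 12;                         // 2400
// Implied, not stated in the table: what is left of the $4,000 after LLM tokens.
const impliedInfrastructureUsd = nextAgentAnnualCostUsd - llmTokensAnnualUsd; // 1600

const netSavingsUsd = laborSavedUsd - nextAgentAnnualCostUsd;  // 74000
const grossMultiple = laborSavedUsd / nextAgentAnnualCostUsd;  // 19.5
const netMultiple = netSavingsUsd / nextAgentAnnualCostUsd;    // 18.5
```

The calculation assumes the full 2 hours per day is recovered. If only part of that time is saved, the multiples scale down proportionally.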

Architecture

[Figure: NextAgent architecture diagram]

Technology

Component	Technology	Purpose
Backend	NestJS + TypeScript	API server, chat streaming, tool execution
Frontend	React + Vite + Tailwind	Chat UI, browser control, recorder management
Database	PostgreSQL 16	Conversations, tools, site profiles, recordings
Cache	Redis 7	Streaming checkpoints, model cache
Auth	Keycloak 24	JWT-based authentication
LLM	Claude Sonnet 4 via OpenRouter	Chat, tool calling, recording analysis
Extension	Chrome Manifest V3	Network capture, DOM tracking, browser automation
Protocol	Model Context Protocol (MCP)	Tool interoperability standard

Roadmap

Delivered (current codebase)

Real-time streaming chat with tool execution
Remote browser control via Chrome extension (25+ CDP commands)
Agent task decomposition with multi-step planning
File operations, web search, code execution tools
External MCP server integration
Network interception and footprint tracking
PDF generation, chart rendering, interactive UI blocks
BYOK (Bring Your Own API Key) support
Long-term memory system

In development

API Recording Engine — enhanced capture with screenshot + network + DOM correlation
API Discovery Service — LLM-powered analysis of recordings into tool definitions
Local MCP Server — serve discovered tools through the existing MCP pipeline
Autonomous Browser Exploration — AI-driven site navigation with vision
Recording merge — multiple sessions enriching the same site profile

Future

Scheduled automation — recurring workflows on cron
Multi-user tool sharing — team-wide discovered tool libraries
Visual workflow builder — drag-and-drop composition of discovered tools
Webhook triggers — start automations from external events
OAuth flow handling — automatic token refresh for discovered APIs
Mobile app support — extend beyond Chrome
On-premise deployment — Docker-based self-hosted for security-sensitive industries

How to evaluate the impact

Before NextAgent

A staff member spends 2-3 hours daily on cross-system tasks: checking inventory in the WMS, creating purchase orders in the vendor portal, updating stock in Shopify, filing shipping requests, reconciling orders. Each task involves logging into a system, navigating to the right page, copying data, switching tabs, and pasting.