Endless Trajectories

Introduction

Most web agent datasets are broken in the same way: they’re small, hand-written, and tied to frozen snapshots of websites that keep changing. That was the starting point for this project.

If you want to train a browser agent today, you usually go download a benchmark like WebArena or Mind2Web, run it against an old snapshot, and optimize on that. Real websites, however, are messy and constantly changing. Buttons get renamed, layouts shift, flows become more complicated, and suddenly the benchmark is measuring performance on a version of the web that doesn’t exist anymore.

We built Endless Trajectories to make web agent data generation dynamic instead of static. The idea is simple. Webpage in, RL dataset out.

Given a website, the system explores the live UI, discovers what the interface can actually do, generates grounded tasks, executes browser trajectories, verifies the results, and exports everything as inspectable training data. Instead of manually writing tasks against static snapshots of webpages, the system generates them directly from the live site.

How it works

The pipeline has five stages: exploration, goal generation, state-graph construction, task generation and execution, and export.

First, the system explores the website using Amazon Nova Act. Rather than simply crawling links, the explorer tries to understand what actions are possible on each page. It interacts with the site the same way a real user would.

While this exploration is happening, Amazon Nova 2 Lite helps generate exploration goals and classify pages. Instead of hardcoding a crawl strategy, the system asks the model to propose the next objective based on the current page and available UI actions.

As the explorer runs, it builds a state graph of the site. One of the things that became clear early on was that websites rarely behave like simple forward paths. Filters loop back to the same page with different states, modals open on top of pages, and users constantly backtrack. Modeling the site as a state graph allows the system to capture these real interactions.
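To make the idea concrete, here is a minimal sketch of a state graph in which a filter loops back to the listing page and a modal sits on top of it. The `State` and `StateGraph` names, the `ui` tuple used to tell states on the same URL apart, and the example actions are all assumptions made for illustration; they are not the project's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class State:
    """A UI state: the URL plus whatever UI state distinguishes it (hypothetical key)."""
    url: str
    ui: tuple = ()  # e.g. (("filter", "price_asc"),) or (("modal", "login"),)


@dataclass
class StateGraph:
    # edges[src] is a list of (action_label, dst) pairs
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_transition(self, src, action, dst):
        if (action, dst) not in self.edges[src]:
            self.edges[src].append((action, dst))
        self.edges[dst]  # register dst as a node even if it has no outgoing edges

    def has_cycle(self):
        """Depth-first search with three colours; reaching a grey node means a cycle."""
        WHITE, GREY, BLACK = 0, 1, 2
        colour = {s: WHITE for s in self.edges}

        def visit(s):
            colour[s] = GREY
            for _, nxt in self.edges[s]:
                if colour[nxt] == GREY:
                    return True
                if colour[nxt] == WHITE and visit(nxt):
                    return True
            colour[s] = BLACK
            return False

        return any(colour[s] == WHITE and visit(s) for s in list(self.edges))


# Same URL, three distinct states: plain listing, filtered listing, login modal.
g = StateGraph()
listing = State("/products")
filtered = State("/products", (("filter", "price_asc"),))
modal = State("/products", (("modal", "login"),))
g.add_transition(listing, "click 'Sort by price'", filtered)
g.add_transition(filtered, "click 'Clear filters'", listing)  # loops back
g.add_transition(listing, "click 'Sign in'", modal)
print(g.has_cycle())
```

Keying a state on more than its URL is what lets the filter loop and the modal show up as separate nodes; a tree or a linear path could not represent the edge back to the listing.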

Once the graph is constructed, the system generates tasks grounded in the discovered structure. This means tasks only reference actions that actually exist on the site, avoiding hallucinated instructions. Tasks are then executed by parallel Nova Act workers, each running its own browser session. During execution, the system records traces, screenshots, and metadata so each trajectory can be inspected later.
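The grounding check and the parallel execution step can be sketched roughly as follows. This is an illustration under stated assumptions: the `GRAPH` dictionary, the task shape, and `run_task` are hypothetical stand-ins, and `run_task` does not call Nova Act; a real worker would drive its own browser session there.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical discovered graph: state -> {action label: next state}
GRAPH = {
    "home": {"click 'Products'": "listing"},
    "listing": {
        "click 'Sort by price'": "listing_sorted",
        "click 'Sign in'": "login_modal",
    },
    "listing_sorted": {"click 'Clear filters'": "listing"},
    "login_modal": {"click 'Close'": "listing"},
}


def is_grounded(task, graph, start="home"):
    """Return True only if every step is an action available from the state reached so far."""
    state = start
    for action in task["steps"]:
        transitions = graph.get(state, {})
        if action not in transitions:
            return False
        state = transitions[action]
    return True


def run_task(task):
    """Stand-in for a browser worker (assumption: no real browser is driven here)."""
    return {"task_id": task["id"], "steps_run": len(task["steps"]), "status": "done"}


def execute_all(tasks, graph, max_workers=4):
    """Drop ungrounded tasks, then run the rest concurrently."""
    grounded = [t for t in tasks if is_grounded(t, graph)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_task, grounded))  # map preserves input order


tasks = [
    {"id": "t1", "steps": ["click 'Products'", "click 'Sort by price'"]},
    {"id": "t2", "steps": ["click 'Products'", "click 'Add to wishlist'"]},  # not in graph
]
results = execute_all(tasks, GRAPH)
print(results)
```

Task `t2` is filtered out before execution because "Add to wishlist" was never observed from the listing state, which is the sense in which grounding avoids hallucinated instructions.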

Successful trajectories are exported as dataset packs containing execution traces, screenshots, verification metadata, and quality scores. These can be used for training, evaluation, or RL experiments with browser agents.
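As a rough picture of what an export step could look like, the sketch below writes verified trajectories to a JSON Lines file alongside a small manifest. The field names (`verified`, `quality`), the file names, and the `min_quality` threshold are assumptions for illustration; the actual pack layout is not specified here.

```python
import json
from pathlib import Path


def export_pack(trajectories, out_dir, min_quality=0.5):
    """Write verified trajectories above a quality threshold to a pack directory."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Keep only trajectories that passed verification and meet the threshold.
    kept = [t for t in trajectories if t["verified"] and t["quality"] >= min_quality]

    # One JSON object per line, so the pack can be streamed during training.
    with open(out / "trajectories.jsonl", "w", encoding="utf-8") as f:
        for t in kept:
            f.write(json.dumps(t, sort_keys=True) + "\n")

    manifest = {"count": len(kept), "dropped": len(trajectories) - len(kept)}
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest
```

Screenshots would typically be stored as separate files and referenced by path from each record rather than embedded in the JSON.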

Challenges

One of the biggest challenges was correctly modeling how websites behave. Early versions assumed mostly forward navigation paths, but real sites quickly introduced cycles through filters, modals, and dynamic UI state. This required redesigning the representation into a more flexible graph structure.

Another challenge was reliability. Running multiple browser agents in parallel exposed issues like browser popups interfering with automation and malformed outputs from model calls. A significant amount of work went into making the pipeline robust enough to run consistently.
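One common way to cope with malformed model output is to validate every response and retry when it does not parse. The sketch below shows that pattern for goal proposals. It assumes the model is asked to reply with a JSON object containing a `goal` string; `call_model` is a hypothetical callable standing in for the real model client, and the retry count is arbitrary.

```python
import json


def parse_goal(raw):
    """Parse the span from the first '{' to the last '}' and require a string 'goal'."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("goal"), str):
        raise ValueError("missing 'goal' string")
    return data


def propose_goal(call_model, prompt, max_attempts=3):
    """Call the model until a response validates, or give up after max_attempts."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_goal(call_model(prompt))
        except ValueError as err:
            last_error = err  # malformed output: try again
    raise RuntimeError(f"no valid goal after {max_attempts} attempts") from last_error
```

Slicing between the outermost braces tolerates replies that wrap the JSON in prose or a code fence, while the type check rejects replies that parse but lack the expected field.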

Why Amazon Nova

Amazon Nova was central to the project.

Amazon Nova Act powers the browser interaction layer, including exploration, execution, structured extraction, and trajectory tracing.

Amazon Nova 2 Lite is used for goal generation, page understanding, and task synthesis. Together, they make it possible to build a system that interacts with the web dynamically rather than relying on static scripts.

The bigger idea

Endless Trajectories is not a single dataset but a factory for producing them.

Instead of relying on benchmarks tied to frozen snapshots, the system can keep generating fresh browser-agent training data directly from live websites. As the web evolves, the datasets stay relevant because they can be regenerated from the current state of the site.