ProtoSim — Train AI Agents in Your Browser

What is ProtoSim?

ProtoSim is the first browser-based platform for training AI agents with reinforcement learning. Select a robot, hit "Train," and watch it learn to walk, balance, or manipulate objects — all inside a web browser. No Python. No CUDA. No cloud credits. No setup.

How it works

Under the hood, ProtoSim runs the MuJoCo physics engine compiled to WebAssembly (WASM) and uses TensorFlow.js for neural network training, both in the browser. The agent is trained with Proximal Policy Optimization (PPO), and the network architecture (MLP, LSTM, GRU, or TCN) is selected through a UI panel. Every episode loops through simulation → observation → action → reward, rendering at 30+ FPS while the reward chart updates in real time.
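For readers new to PPO, here is a minimal sketch of its clipped surrogate objective in plain TypeScript. It is not ProtoSim's training code: the function name, the array-based inputs, and the default clip range of 0.2 are assumptions made for illustration, and a real implementation would compute this on tensors and maximize it with gradient ascent.

```typescript
// Clipped surrogate objective from PPO, averaged over a batch of samples.
// ratio_t = pi_new(a_t | s_t) / pi_old(a_t | s_t) = exp(logProbNew - logProbOld)
// clipEpsilon = 0.2 is an assumed default, not a value taken from ProtoSim.
function ppoClippedObjective(
  logProbNew: number[],
  logProbOld: number[],
  advantages: number[],
  clipEpsilon = 0.2,
): number {
  const n = advantages.length;
  if (n === 0 || logProbNew.length !== n || logProbOld.length !== n) {
    throw new Error("inputs must be non-empty and have the same length");
  }
  let total = 0;
  for (let t = 0; t < n; t++) {
    const ratio = Math.exp(logProbNew[t] - logProbOld[t]);
    // Keep the ratio inside [1 - eps, 1 + eps] so one update cannot move the policy too far.
    const clipped = Math.min(Math.max(ratio, 1 - clipEpsilon), 1 + clipEpsilon);
    // Take the more pessimistic of the clipped and unclipped terms.
    total += Math.min(ratio * advantages[t], clipped * advantages[t]);
  }
  return total / n;
}
```

When the old and new log-probabilities are identical, the ratio is 1 and the objective reduces to the mean advantage. The clipping only matters once the updated policy starts to diverge from the one that collected the data.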

Built with

  • MuJoCo WASM — physics simulation compiled to WebAssembly
  • TensorFlow.js — WebGL-accelerated neural network training
  • React + TypeScript + Vite — modern frontend toolchain
  • Model Hub — 23 pre-trained models and 14 datasets for transfer learning

Challenges

GPU requirement for heavy models — While ProtoSim runs in any modern browser, training complex humanoid robots (like Unitree G1 with its 50+ STL meshes and 26 actuated joints) or using larger network architectures (LSTM, TCN) is computationally intensive. A dedicated GPU with WebGL support is strongly recommended for fast training on heavy models — integrated graphics will work for simple agents like Cartpole, but humanoid training will run significantly slower without a discrete GPU.

MuJoCo file loading in the browser — MuJoCo's Python bindings assume a filesystem. In the browser, every mesh, texture, and XML must be fetched and resolved at runtime. We built a custom include resolver that handles nested XML references, case-insensitive mesh lookups (many STL files use inconsistent casing), and duplicate geom name conflicts — all without a server.
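The idea behind such a resolver can be sketched as follows. This is a simplified illustration, not ProtoSim's resolver: it assumes every fetched file already sits in an in-memory map keyed by a flat path, it matches `<include file="..."/>` tags with a regular expression instead of a real XML parser, and the names `VirtualFiles`, `buildIndex`, `stripRoot`, and `resolveIncludes` are hypothetical.

```typescript
// Hypothetical in-memory file table: path -> text contents of each fetched file.
type VirtualFiles = Map<string, string>;

// Index paths by lowercase so "Robot.XML" and "robot.xml" resolve to the same entry.
function buildIndex(files: VirtualFiles): Map<string, string> {
  const index = new Map<string, string>();
  for (const path of files.keys()) index.set(path.toLowerCase(), path);
  return index;
}

// MuJoCo drops the top-level element of an included file and splices in its children.
// This string-based version is a simplification; a browser build could use DOMParser.
function stripRoot(xml: string): string {
  const body = xml.replace(/<\?xml[^>]*\?>/, "").trim();
  const open = body.indexOf(">");
  const close = body.lastIndexOf("</");
  return open === -1 || close <= open ? "" : body.slice(open + 1, close).trim();
}

// Recursively inline every include, rejecting missing files and circular references.
function resolveIncludes(
  path: string,
  files: VirtualFiles,
  index: Map<string, string> = buildIndex(files),
  stack: string[] = [],
): string {
  const key = index.get(path.toLowerCase());
  if (key === undefined) throw new Error(`Missing include: ${path}`);
  if (stack.includes(key)) {
    throw new Error(`Circular include: ${[...stack, key].join(" -> ")}`);
  }
  const xml = files.get(key) ?? "";
  const includeTag = /<include\s+file\s*=\s*"([^"]+)"\s*\/>/g;
  return xml.replace(includeTag, (_match: string, file: string) =>
    stripRoot(resolveIncludes(file, files, index, [...stack, key])),
  );
}
```

The same lowercase index could serve mesh lookups, so a reference to `Pelvis.STL` still finds a file stored as `pelvis.stl`. Resolving paths relative to the including file and renaming duplicate geoms are left out of this sketch.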

TypeScript strictness with TF.js — TensorFlow.js layers return union types spanning `Tensor | SymbolicTensor | Tensor[] | SymbolicTensor[]`. TypeScript's advanced inference flagged every `.apply()` call. We re-architected type casting across the entire RL pipeline using double-cast patterns (`as unknown as`) to satisfy the compiler.
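To make the pattern concrete, here is a small self-contained sketch. The `Tensor`, `SymbolicTensor`, and `LayerLike` types below are local stand-ins written for this example, not the real TensorFlow.js classes, and `applyUnchecked` and `expectSymbolic` are hypothetical helper names. The first helper shows the double cast described above; the second shows one possible alternative that narrows the union with a runtime check.

```typescript
// Stand-in types for illustration only; a real project would import these from TF.js.
class Tensor {
  readonly kind = "tensor";
  readonly shape: number[];
  constructor(shape: number[]) {
    this.shape = shape;
  }
}
class SymbolicTensor {
  readonly kind = "symbolic";
  readonly shape: (number | null)[];
  constructor(shape: (number | null)[]) {
    this.shape = shape;
  }
}
type ApplyResult = Tensor | SymbolicTensor | Tensor[] | SymbolicTensor[];

// Hypothetical layer whose apply() returns the wide union.
interface LayerLike {
  apply(input: Tensor | SymbolicTensor): ApplyResult;
}

// Double cast: satisfies the compiler but performs no check at runtime.
function applyUnchecked(layer: LayerLike, input: SymbolicTensor): SymbolicTensor {
  return layer.apply(input) as unknown as SymbolicTensor;
}

// Alternative: narrow the union at runtime and fail loudly on an unexpected shape.
function expectSymbolic(result: ApplyResult): SymbolicTensor {
  if (result instanceof SymbolicTensor) return result;
  throw new TypeError("expected a single SymbolicTensor from apply()");
}
```

The double cast is the shorter option but hides mistakes until training fails. A checked helper like `expectSymbolic` costs one `instanceof` test per call and keeps the cast in a single place.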

Checkpoint compatibility across architectures — Switching from MLP to LSTM changes the network topology, making saved weights incompatible. We added architecture metadata to checkpoint JSON and built a validation layer that detects mismatches on import, deletes stale checkpoints, and gracefully resets.
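A validation step of this kind might look like the sketch below. The checkpoint layout (`meta.architecture`, `meta.layerSizes`, `weights`) and the function name `importCheckpoint` are assumptions for illustration; the actual ProtoSim checkpoint schema is not described here.

```typescript
type Architecture = "MLP" | "LSTM" | "GRU" | "TCN";

// Hypothetical checkpoint layout; the real ProtoSim schema may differ.
interface CheckpointMeta {
  architecture: Architecture;
  layerSizes: number[];
}
interface Checkpoint {
  meta: CheckpointMeta;
  weights: number[][];
}

type ImportResult =
  | { ok: true; checkpoint: Checkpoint }
  | { ok: false; reason: string };

// Compare the metadata stored in a checkpoint with what the current network expects.
function importCheckpoint(json: string, expected: CheckpointMeta): ImportResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return { ok: false, reason: "checkpoint is not valid JSON" };
  }
  const candidate = parsed as Partial<Checkpoint> | null;
  const meta = candidate?.meta;
  if (!meta || !Array.isArray(meta.layerSizes) || !Array.isArray(candidate?.weights)) {
    return { ok: false, reason: "checkpoint is missing metadata or weights" };
  }
  if (meta.architecture !== expected.architecture) {
    return {
      ok: false,
      reason: `saved ${meta.architecture}, current ${expected.architecture}`,
    };
  }
  const sameSizes =
    meta.layerSizes.length === expected.layerSizes.length &&
    meta.layerSizes.every((size, i) => size === expected.layerSizes[i]);
  if (!sameSizes) return { ok: false, reason: "layer sizes differ" };
  return { ok: true, checkpoint: candidate as Checkpoint };
}
```

On an `ok: false` result the caller can delete the stale checkpoint and start from fresh weights, showing `reason` to the user instead of failing while loading tensors of the wrong shape.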

Humanoid robot loading — The Unitree G1 robot has 50+ STL mesh files. Getting it to load reliably in the browser required rebuilding the robot manifest system three times — from file input, to import.meta.glob, to a static `/public/robots/` manifest.json approach.
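The static manifest approach can be sketched roughly as below. The manifest shape, the field names, and the helpers `assetUrls` and `fetchAssets` are hypothetical and chosen for illustration. The `/robots` base URL assumes Vite's default behavior of serving files in `public/` from the site root.

```typescript
// Hypothetical manifest shape; field names are illustrative, not ProtoSim's schema.
interface RobotEntry {
  id: string;
  xml: string; // entry-point MJCF file, relative to the robot's folder
  meshes: string[]; // mesh files, relative to the robot's folder
}
interface RobotManifest {
  robots: RobotEntry[];
}

// Minimal fetch-like signature so the sketch does not depend on browser typings.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// List every URL that must be fetched before the robot can be loaded.
function assetUrls(manifest: RobotManifest, robotId: string, baseUrl = "/robots"): string[] {
  const robot = manifest.robots.find((r) => r.id === robotId);
  if (!robot) throw new Error(`Unknown robot: ${robotId}`);
  return [robot.xml, ...robot.meshes].map((file) => `${baseUrl}/${robot.id}/${file}`);
}

// Fetch all assets in parallel and return their bytes keyed by URL.
async function fetchAssets(urls: string[], fetchFn: Fetcher): Promise<Map<string, ArrayBuffer>> {
  const entries = await Promise.all(
    urls.map(async (url) => {
      const res = await fetchFn(url);
      if (!res.ok) throw new Error(`Failed to fetch ${url}: ${res.status}`);
      return [url, await res.arrayBuffer()] as const;
    }),
  );
  return new Map(entries);
}
```

In the browser, `fetchFn` can simply be `(url) => fetch(url)`. Because the manifest lists every file up front, a missing mesh surfaces as a single failed request with a clear URL rather than as a silent gap in the loaded model.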

What we learned

Browser-based ML is viable today. WASM and WebGL are mature enough to run real physics simulation and neural network training at interactive speeds. The bottleneck isn't performance — it's ergonomics. Making RL as easy as pressing "Play" is the unlock.

The pitch

ProtoSim exists because training AI should be as simple as opening a URL. One click and you're watching a robot learn.

Final note

This project is currently a demo of my idea, and I am still working on improving it well beyond this version.
