🦾 Artifex: Conversational Robotics Engine

Bridge the gap between natural language and physical simulation. Artifex is a browser-based, AI-powered robotics laboratory that turns your spoken ideas into real-time physical actions — no hardware, no installs, no limits.


💡 Inspiration

Robotics development is notoriously difficult to get into. Between heavy dependencies, expensive physical hardware, and complex control interfaces, the barrier to entry is massive. We asked ourselves:

What if anyone could experiment with a state-of-the-art robotic arm right in their browser, using nothing but natural language?

We were inspired to merge the reasoning capabilities of modern Generative AI with the rigorous physics of MuJoCo — and to democratize robotics research by creating an interactive, frictionless environment where anyone could observe a robotic agent visually "think," "plan," and autonomously correct itself in real time. By moving what would normally be $50,000 of hardware and a heavyweight toolchain directly into the browser, we are proving that the future of robotics prototyping is accessible to everyone.


🤖 What it does

Artifex is a fully interactive robotics simulation powered by LLMs. Users are presented with a tabletop scene containing colored cubes, a puck, a goalpost, and a highly accurate simulated Franka Emika Panda 7-DOF robotic arm.

Through a premium, glassmorphic conversational interface — designed to be clean and instantly presentation-ready — users simply type commands like:

"Push the puck into the goalpost."

The Core Innovation: Autonomous Continuous Learning

The magic of Artifex is its true autonomy. The AI controller doesn't blindly execute a script — it operates on a continuous learning loop:

  1. 📡 Reads the live rigid-body state from the physics engine
  2. 📐 Calculates the kinematic distance to target
  3. 🦾 Executes the physical action via the simulated arm
  4. 👁️ Observes the result — did the puck reach the goalpost?
  5. 🔁 Self-corrects — if it missed, the agent independently analyzes the coordinate delta, adjusts its approach vector, and retries

You watch this entire autonomous decision loop unfold in stunning, low-latency 3D physics directly in the browser — complete with real-time streaming logs of the agent's internal "thought" process.
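To make the loop concrete, here is a minimal Python sketch of how such a controller cycle could be structured. The helper names (read_state, plan_action, execute) and the tolerance and retry values are illustrative assumptions, not the actual Artifex API:

```python
import math

GOAL_EPSILON = 0.05   # metres; illustrative tolerance for "the puck reached the goalpost"
MAX_ATTEMPTS = 5      # illustrative cap on self-correction retries

def distance(a, b):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def control_loop(read_state, plan_action, execute, goal_xyz):
    """Observe -> plan -> act -> observe -> correct, until the goal is reached."""
    for _ in range(MAX_ATTEMPTS):
        state = read_state()                        # 1. read the live rigid-body state
        d = distance(state["puck_xyz"], goal_xyz)   # 2. kinematic distance to target
        if d <= GOAL_EPSILON:                       # 4. observe: did the puck reach the goal?
            return True
        action = plan_action(state, goal_xyz)       # 5. self-correct: re-plan from the delta
        execute(action)                             # 3. execute the action via the simulated arm
    return False                                    # give up after MAX_ATTEMPTS corrections
```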


🏗️ How we built it

Artifex is engineered as a highly scalable full-stack monorepo, divided into five tightly integrated layers, with the frontend and backend communicating over a persistent WebSocket connection.

1. 🖥️ The Physics UI (Frontend)

Built with React and Vite. The initial UI shell was scaffolded in minutes using Lovable — AI handled the three-panel layout, dark glassmorphic theme, and component wiring so we could focus on real logic instead of boilerplate. The 3D physics engine is MuJoCo (Multi-Joint dynamics with Contact) compiled to WebAssembly, running heavy rigid-body calculations — inverse kinematics, continuous collision detection — purely client-side at 60fps.
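For readers unfamiliar with MuJoCo, the snippet below shows the basic simulate-and-read-state pattern using the official Python bindings; the browser build exposes the same model/data concepts through WebAssembly. The scene file and body name are placeholders, not the actual Artifex assets:

```python
import mujoco

# Load a scene description (MJCF XML); "panda_tabletop.xml" is a placeholder name.
model = mujoco.MjModel.from_xml_path("panda_tabletop.xml")
data = mujoco.MjData(model)

# Advance the rigid-body simulation; in the browser the equivalent stepping happens
# inside the render loop to keep the physics and the 60fps canvas in sync.
for _ in range(1000):
    mujoco.mj_step(model, data)

# Read live state for the AI controller, e.g. the world position of a body named "puck".
puck_xyz = data.body("puck").xpos
print(puck_xyz)
```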

2. 💬 The AI Chat Interface

The conversational layer is built on assistant-ui — production-ready React components that gave us streaming chat, tool call rendering, and state management out of the box. We extended it with a custom AgentBridge runtime adapter to intercept WebSocket events and render the agent's live continuous-learning logs as structured, auto-scrolling Markdown blockquotes directly inside the chat thread.

3. 🧠 The AI Controller (Backend)

The brain of the operation is a FastAPI Python server powered by Google DeepMind's Gemini models — specifically gemini-robotics-er-1.5-preview — for advanced spatial reasoning. The model translates natural-language intent into target coordinates \((x, y, z)\), and the controller measures how far the scene is from that target using Euclidean distance:

$$ d = \sqrt{(x_{target} - x_{current})^2 + (y_{target} - y_{current})^2 + (z_{target} - z_{current})^2} $$

This distance metric feeds directly into the autonomous correction loop — if \(d > \epsilon\) after execution, the agent re-plans and retries.
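As an illustration of both halves (not the production code), the sketch below shows how a natural-language command could be turned into a target coordinate with the google-genai Python SDK, and how the epsilon check drives the re-plan decision. The prompt, JSON shape, and epsilon value are assumptions:

```python
import json
import numpy as np
from google import genai

client = genai.Client()    # assumes GOOGLE_API_KEY is set in the environment
EPSILON = 0.05             # metres; illustrative success tolerance

def plan_target(command: str, scene_state: dict) -> np.ndarray:
    """Ask the model to turn a natural-language command into an (x, y, z) target."""
    prompt = (
        f"Scene state: {json.dumps(scene_state)}\n"
        f"Command: {command}\n"
        'Respond with JSON only, e.g. {"target": [x, y, z]}'
    )
    response = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",
        contents=prompt,
    )
    # A production controller would request structured output and validate the reply.
    return np.array(json.loads(response.text)["target"], dtype=float)

def needs_replan(current_xyz: np.ndarray, target_xyz: np.ndarray) -> bool:
    """If d > epsilon after execution, the agent re-plans and retries."""
    return float(np.linalg.norm(target_xyz - current_xyz)) > EPSILON
```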

The entire backend was built with Augment Code — a developer workspace where agents are coordinated, specs stay alive, and every workspace is isolated. Augment held full monorepo context across sessions, coordinated multi-file edits, and let our team ship complex backend features at agent speed.

4. ⚡ The Real-Time Bridge

The backend streams live execution states across a persistent WebSocket connection:

thinking  →  plan  →  executing  →  observing  →  correcting  →  result

Our custom @assistant-ui/react adapters catch every state transition and vividly display the agent's self-correction process as structured UI blockquotes — making the AI's reasoning fully transparent and auditable.
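On the backend, streaming those phases can be as simple as the hedged FastAPI sketch below; the endpoint path and payload fields are illustrative, not the exact Artifex protocol:

```python
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws/agent")   # illustrative path
async def agent_socket(ws: WebSocket):
    await ws.accept()
    command = await ws.receive_text()

    # Each phase of the loop is pushed as a structured JSON event; the frontend
    # adapter renders these as auto-scrolling blockquotes in the chat thread.
    for phase in ("thinking", "plan", "executing", "observing", "correcting", "result"):
        await ws.send_json({"phase": phase, "detail": f"{phase} for: {command!r}"})

    await ws.close()
```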

5. ☁️ Cloud Deployment

The entire monorepo is hosted on DigitalOcean App Platform. DigitalOcean's straightforward platform made deploying a full-stack WebSocket monorepo surprisingly painless — from git push to a live, secure production domain in under an hour. We configured intelligent path-trimmed routing to serve both the static Vite frontend and the dynamic Python backend over a single unified domain.


🧗 Challenges we ran into

| Challenge | What Happened | How We Solved It |
| --- | --- | --- |
| Deployment Routing Conflicts | UI and backend competed for the root web path | Engineered DigitalOcean path-trimming (e.g. /artifex-agent) so FastAPI correctly caught ws:// upgrade requests without a 404 |
| Package Manager Clashes | Upgrading to the speedier uv package manager caused cloud buildpack failures | Pinned the exact Python 3.12.8 runtime inside the container and purged all legacy requirements.txt files |
| UI State Hydration | Instantaneous WebSocket streams caused rendering jank in React | Built a custom AgentBridge adapter to safely buffer and format the agent's continuous-learning thought logs into smooth, auto-scrolling UI elements |

🏆 Accomplishments that we're proud of

  • 🔁 Autonomous Correction Loop — The agent's ability to observe a missed physics interaction (like missing the goalpost), analyze the coordinate delta, and continuously self-correct without any human intervention is a genuine demonstration of reliable LLM-driven autonomy in a physical environment.

  • 🌐 Zero-Install Physics — MuJoCo running natively in the browser via WebAssembly means anyone can try Artifex instantly — no Linux environment, no ROS stack, no gigabyte downloads.

  • Premium User Experience — The presentation-ready chat panel, dynamic suggestion cards, Markdown-formatted learning logs, and dark glassmorphic styling make Artifex feel like a next-generation product, not a hackathon prototype.


🚀 What's next for Artifex

  • [ ] Add dynamic obstacle generation and massive multi-object manipulation goals (like self-assembling pyramids)
  • [ ] Introduce computer vision feedback so the agent can describe what it "sees" before acting, rather than purely relying on internal state data
  • [ ] Establish a Community Hub allowing users to save, replay, and share successful continuous-learning interaction weights

🙏 Built With & Sponsors

Artifex would not exist without these exceptional tools and platforms. Massive thanks to our sponsors:

| Sponsor | Role in Artifex |
| --- | --- |
| Google DeepMind | The spatial reasoning brain — gemini-robotics-er-1.5-preview powers every autonomous planning and self-correction loop |
| Augment Code | Full-monorepo agent-speed development — coordinated specs, isolated workspaces, and context that never expired |
| DigitalOcean | Rock-solid App Platform — git push to live WebSocket production in under an hour |
| Lovable | AI-scaffolded frontend — three-panel glassmorphic UI from design prompt to deployed app in minutes |
| assistant-ui | Production-grade streaming chat UI — the perfect host for our live autonomous-learning logs |
