🦾 Artifex: Conversational Robotics Engine
Bridge the gap between natural language and physical simulation. Artifex is a browser-based, AI-powered robotics laboratory that turns your spoken ideas into real-time physical actions — no hardware, no installs, no limits.
💡 Inspiration
Robotics development is notoriously difficult to get into. Between heavy dependencies, expensive physical hardware, and complex control interfaces, the barrier to entry is massive. We asked ourselves:
What if anyone could experiment with a state-of-the-art robotic arm right in their browser, using nothing but natural language?
We were inspired to merge the reasoning capabilities of modern Generative AI with the rigorous physics of MuJoCo — and to democratize robotics research by creating an interactive, frictionless environment where anyone can watch a robotic agent visually "think," "plan," and autonomously correct itself in real time. By replacing roughly $50,000 of physical hardware with a browser tab, we are proving that the future of robotics prototyping is accessible to everyone.
🤖 What it does
Artifex is a fully interactive robotics simulation powered by LLMs. Users are presented with a tabletop scene containing colored cubes, a puck, a goalpost, and a highly accurate simulated Franka Emika Panda 7-DOF robotic arm.
Through a premium, glassmorphic conversational interface — designed to be clean and instantly presentation-ready — users simply type commands like:
"Push the puck into the goalpost."
The Core Innovation: Autonomous Continuous Learning
The magic of Artifex is its true autonomy. The AI controller doesn't blindly execute a script — it operates on a continuous learning loop:
- 📡 Reads the live rigid-body state from the physics engine
- 📐 Calculates the kinematic distance to target
- 🦾 Executes the physical action via the simulated arm
- 👁️ Observes the result — did the puck reach the goalpost?
- 🔁 Self-corrects — if it missed, the agent independently analyzes the coordinate delta, adjusts its approach vector, and retries
You watch this entire autonomous decision loop unfold in stunning, low-latency 3D physics directly in the browser — complete with real-time streaming logs of the agent's internal "thought" process.
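To make the loop concrete, here is a minimal Python sketch of that observe–plan–act–correct cycle. The `read_state` and `plan_and_execute` callables are hypothetical stand-ins for the physics bridge and the Gemini planner, not Artifex's actual API:

```python
import math
from typing import Callable

Vec3 = tuple[float, float, float]

def control_loop(read_state: Callable[[], Vec3],
                 plan_and_execute: Callable[[Vec3, Vec3], None],
                 goal: Vec3,
                 epsilon: float = 0.02,
                 max_attempts: int = 5) -> bool:
    """Observe -> plan -> execute -> observe -> correct, per the list above."""
    for attempt in range(1, max_attempts + 1):
        puck = read_state()                   # 📡 read live rigid-body state
        plan_and_execute(goal, puck)          # 📐🦾 plan a trajectory and act
        puck = read_state()                   # 👁️ observe the result
        error = math.dist(goal, puck)         # kinematic distance to target
        if error <= epsilon:                  # puck reached the goalpost
            return True
        print(f"attempt {attempt}: missed by {error:.3f} m, re-planning")  # 🔁
    return False
```

The real loop runs against the MuJoCo bridge and Gemini planner described in the next section.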
🏗️ How we built it
Artifex is engineered as a highly scalable full-stack monorepo, divided into five tightly integrated layers communicating over WebSockets.
1. 🖥️ The Physics UI (Frontend)
Built with React and Vite. The initial UI shell was scaffolded in minutes using Lovable — AI handled the three-panel layout, dark glassmorphic theme, and component wiring so we could focus on real logic instead of boilerplate. The 3D physics engine is MuJoCo (Multi-Joint dynamics with Contact) compiled to WebAssembly, running heavy rigid-body calculations — inverse kinematics, continuous collision detection — purely client-side at 60fps.
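Under the hood it is the same MuJoCo C core you would drive from Python. As a point of reference, here is a minimal stepping loop using the official `mujoco` Python bindings — the WASM build performs the equivalent work client-side on each animation frame. The scene XML is a trivial stand-in, not our actual tabletop:

```python
import mujoco

XML = """
<mujoco>
  <worldbody>
    <geom type="plane" size="1 1 0.1"/>
    <body name="puck" pos="0 0 0.1">
      <freejoint/>
      <geom type="cylinder" size="0.05 0.01" rgba="0.9 0.2 0.2 1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Advance physics to cover one 60fps render frame; each mj_step
# integrates model.opt.timestep seconds (2 ms by default).
steps_per_frame = int((1 / 60) / model.opt.timestep)
for _ in range(steps_per_frame):
    mujoco.mj_step(model, data)

print(data.body("puck").xpos)  # the live rigid-body state the agent reads
```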
2. 💬 The AI Chat Interface
The conversational layer is built on assistant-ui — production-ready React components that gave us streaming chat, tool-call rendering, and state management out of the box. We extended it with a custom AgentBridge runtime adapter to intercept WebSocket events and render the agent's live continuous-learning logs as structured, auto-scrolling Markdown blockquotes directly inside the chat thread.
3. 🧠 The AI Controller (Backend)
The brain of the operation is a FastAPI Python server powered by Google DeepMind Gemini models — specifically gemini-robotics-er-1.5-preview — for advanced spatial reasoning. The model translates natural-language intent into target coordinates \((x, y, z)\), and the controller scores each action against its target using the Euclidean distance:
$$ d = \sqrt{(x_{target} - x_{current})^2 + (y_{target} - y_{current})^2 + (z_{target} - z_{current})^2} $$
This distance metric feeds directly into the autonomous correction loop — if \(d > \epsilon\) after execution, the agent re-plans and retries.
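In code, that success check is just a few lines. The \(\epsilon\) tolerance and the coordinates below are illustrative, not our tuned values:

```python
import math

EPSILON = 0.02  # assumed success tolerance in metres (illustrative)

goal = (0.60, 0.00, 0.05)   # goalpost centre (illustrative coordinates)
puck = (0.48, 0.07, 0.05)   # puck position read back from the simulator

d = math.dist(goal, puck)   # the Euclidean distance d from the formula above
print(f"d = {d:.3f} m")     # d = 0.139 m
if d > EPSILON:
    print("missed: the agent re-plans and retries")
```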
The entire backend was built with Augment Code — a developer workspace where agents are coordinated, specs stay alive, and every workspace is isolated. Augment held full monorepo context across sessions, coordinated multi-file edits, and let our team ship complex backend features at agent speed.
4. ⚡ The Real-Time Bridge
The backend streams live execution states across a persistent WebSocket connection:
thinking → plan → executing → observing → correcting → result
Our custom @assistant-ui/react adapters catch every state transition and vividly display the agent's self-correction process as structured UI blockquotes — making the AI's reasoning fully transparent and auditable.
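A stripped-down sketch of that bridge on the FastAPI side — the endpoint path and payload shape are illustrative, and our real handler streams each phase as the agent produces it rather than in a fixed loop:

```python
from fastapi import FastAPI, WebSocket

app = FastAPI()

PHASES = ["thinking", "plan", "executing", "observing", "correcting", "result"]

@app.websocket("/ws")  # illustrative route; ours sits behind a trimmed path prefix
async def agent_bridge(ws: WebSocket) -> None:
    await ws.accept()
    command = await ws.receive_text()  # e.g. "Push the puck into the goalpost."
    for phase in PHASES:               # emitted live as the agent works
        await ws.send_json({"type": "state",
                            "phase": phase,
                            "log": f"{phase}: {command}"})  # placeholder log line
    await ws.close()
```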
5. ☁️ Cloud Deployment
The entire monorepo is hosted on DigitalOcean App Platform. DigitalOcean's straightforward platform made deploying a full-stack WebSocket monorepo surprisingly painless — from git push to a live, secure production domain in under an hour. We configured intelligent path-trimmed routing to serve both the static Vite frontend and the dynamic Python backend over a single unified domain.
🧗 Challenges we ran into
| Challenge | What Happened | How We Solved It |
|---|---|---|
| Deployment Routing Conflicts | UI and Backend competed for the root web path | Engineered DigitalOcean path-trimming (e.g. /artifex-agent) so FastAPI correctly caught ws:// upgrade requests without a 404 |
| Package Manager Clashes | Upgrading to the speedier uv package manager caused cloud buildpack failures | Pinned the exact Python 3.12.8 runtime inside the container and purged all legacy requirements.txt files |
| UI State Hydration | Instantaneous WebSocket streams caused rendering jank in React | Built a custom AgentBridge adapter to safely buffer and format the agent's continuous-learning thought-logs into smooth, auto-scrolling UI elements |
🏆 Accomplishments that we're proud of
🔁 Autonomous Correction Loop — The agent's ability to observe a missed physics interaction (like missing the goalpost), analyze the coordinate delta, and continuously self-correct without any human intervention is a genuine demonstration of reliable LLM-driven autonomy in a simulated physical environment.
🌐 Zero-Install Physics — MuJoCo running natively in the browser via WebAssembly means anyone can try Artifex instantly — no Linux environment, no ROS stack, no gigabyte downloads.
✨ Premium User Experience — The presentation-ready chat panel, dynamic suggestion cards, Markdown-formatted learning logs, and dark glassmorphic styling make Artifex feel like a next-generation product, not a hackathon prototype.
🚀 What's next for Artifex
- [ ] Add dynamic obstacle generation and massive multi-object manipulation goals (like self-assembling pyramids)
- [ ] Introduce computer vision feedback so the agent can describe what it "sees" before acting, rather than purely relying on internal state data
- [ ] Establish a Community Hub allowing users to save, replay, and share successful continuous-learning sessions
🙏 Built With & Sponsors
Artifex would not exist without these exceptional tools and platforms. Massive thanks to our sponsors:
Built With
- assistant-ui
- digitalocean
- fastapi
- google-gemini
- lovable
- mujoco
- python
- react
- tailwind-css
- typescript
- uv
- vite
- webassembly
- websockets