Inspiration
Robots can’t talk to each other. Most robots use their own command format, so researchers and companies spend hours building brittle bridges before they can even prototype a task. We wanted one pipeline that starts with a multimodal prompt (text, images, or raw sensor frames), runs it through Gemini’s Live API, and comes out the other side as a single motion plan every robot can understand. The result is a shared JSON message schema, a MuJoCo-validated planner, and Weave as the observability backbone. For engineers, it provides full-depth analysis of an embodied agent’s output and process while planning tasks.
What it does
- Prompt / Multimodal Input – RoboWeave accepts text, live webcam frames, depth maps, and IMU streams (whatever the robot “sees”).
- LLM reasoning – The Gemini Live API ingests that multimodal bundle and returns compact JSON that feeds both Weave and the planner.
- Motion planning – The system extracts the chosen function and passes it on to MuJoCo, which knows how to handle simple instructions like `forward()`, `rotate()`, or even `flip()` with set parameters.
- Weave Integration – Full observability on every action and task of the embodied agent.
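The steps above can be sketched end to end. This is a hypothetical illustration of what a shared JSON motion plan might look like and how a planner could dispatch it onto simple motion primitives; the field names, schema, and primitive signatures are assumptions, not RoboWeave’s actual format.

```python
import json

# Hypothetical plan in the shared JSON schema (illustrative fields only).
plan_json = '''
{
  "task": "patrol",
  "steps": [
    {"action": "forward", "params": {"distance_m": 1.5}},
    {"action": "rotate",  "params": {"angle_deg": 90}},
    {"action": "flip",    "params": {}}
  ]
}
'''

# Stand-ins for the motion primitives MuJoCo would execute.
def forward(distance_m):
    return f"forward {distance_m} m"

def rotate(angle_deg):
    return f"rotate {angle_deg} deg"

def flip():
    return "flip"

PRIMITIVES = {"forward": forward, "rotate": rotate, "flip": flip}

def execute(plan: dict) -> list[str]:
    """Map each schema step onto its motion primitive, in order."""
    return [PRIMITIVES[step["action"]](**step["params"]) for step in plan["steps"]]

log = execute(json.loads(plan_json))
```

Because every robot backend only needs to implement the same small primitive table, the plan itself stays robot-agnostic.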
How we built it
- Started with an unstructured command. Gemini is the brains: every high-level decision relies on its reasoning.
- Forced the Gemini 1.5 Pro Live API to output only valid JSON that both our program and Weave can parse, so no post-processing is needed.
- Used MuJoCo as the last check before real robots; we worked entirely in simulation since we didn't have a robot to test RoboWeave on.
- Built a React Flow interface that shows prompts, nodes, and motions end-to-end.
- Connected Weave for traceability, so every step (prompt, LLM call, plan, function call to MuJoCo) shows up in one timeline.
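In the real project this per-step timeline comes from W&B Weave; as a minimal stand-in, the same idea can be sketched with a decorator that records each pipeline stage (LLM call, plan parse, MuJoCo dispatch) into one ordered trace. All function names and the fake Gemini response below are illustrative.

```python
import functools
import json
import time

TIMELINE = []  # one ordered record per traced call

def traced(stage):
    """Record each call's stage name and wall-clock duration."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            out = fn(*args, **kwargs)
            TIMELINE.append({"stage": stage, "fn": fn.__name__,
                             "elapsed_s": time.perf_counter() - start})
            return out
        return inner
    return wrap

@traced("llm")
def call_gemini(prompt):  # placeholder for the Live API call
    return '{"steps": [{"action": "forward", "params": {"distance_m": 1}}]}'

@traced("plan")
def parse_plan(raw):
    return json.loads(raw)

@traced("sim")
def send_to_mujoco(plan):  # placeholder for the MuJoCo bridge
    return len(plan["steps"])

n_steps = send_to_mujoco(parse_plan(call_gemini("walk forward one meter")))
stages = [entry["stage"] for entry in TIMELINE]
```

With Weave, each of these would be an op, and the timeline view comes for free in the dashboard.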
Challenges we ran into
- Gemini sometimes drifted from the schema or was cut off mid-response, confusing Weave's traces. Inline validation and prompt tweaks fixed that.
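The inline validation can be sketched as a parse-check-retry loop: reject anything that fails to parse (e.g. a truncated response) or is missing required keys, then re-prompt with a corrective instruction. The required key set and the retry wording are assumptions for illustration.

```python
import json

REQUIRED_KEYS = {"task", "steps"}  # assumed minimal schema

def validate_plan(raw: str):
    """Return the parsed plan, or None if it drifted from the schema."""
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError:  # e.g. response cut off mid-object
        return None
    if not isinstance(plan, dict) or not REQUIRED_KEYS <= plan.keys():
        return None
    return plan

def plan_with_retry(ask_llm, prompt, max_tries=3):
    """Re-prompt the model until it emits a schema-valid JSON plan."""
    for _ in range(max_tries):
        plan = validate_plan(ask_llm(prompt))
        if plan is not None:
            return plan
        prompt += "\nReturn ONLY valid JSON with keys: task, steps."
    raise ValueError("model never produced a valid plan")
```

Catching bad output before it reaches Weave keeps the trace timeline clean: only validated plans get logged as real steps.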
Accomplishments we’re proud of
- Google Gemini's Live API controlled a simulated GO2 in MuJoCo.
- End-to-end latency sits at 1.9 s using just the free Gemini tier.
- Entire repo is compact and optimized.
What we learned
- Full-path logs through Weave are super useful since many “AI” bugs are really visibility bugs.
- Make your MCP tools as transparent as possible for better results when working with agents.
What’s next for RoboWeave
- Replace MuJoCo with live Unitree GO2s on ROS 2.
- Feed RGB-D video through the same loop for instant obstacle re-planning.
- Open-source the Gemini spec plus a client so other builders can use our system.
- Stress-test a multi-robot fleet performing a complicated task.
One unstructured prompt → one JSON plan → many robots, no in-between scripts. That’s the goal.
Built With
- bezier-spline-motion-library
- docker
- eslint-+-prettier
- github-actions
- google-gemini-1.5-pro-live-api
- json-taskgraph-schema
- model-context-protocol-adapter
- mujoco-3d-engine
- pyside6
- python-3.10
- react-18
- react-flow
- tanstack-router
- typescript-4
- unitree-go2-urdf
- unocss
- vite
- weights-&-biases-weave
- zustand
