Inspiration

There are many reasons someone might want to perform everyday tasks remotely: they could be preoccupied, too tired, or too sick to do them in person. On top of that, we hadn't seen a Meta Quest paired with low-cost robot arms like the SO-101 before, and we decided to make it happen.

What it does

Cadence Labs turns a Meta Quest 3S into a real-time bimanual teleoperation rig for SO-101 robot arms, with synchronized dataset recording baked in.

The operator puts on the headset, holds their hands at a comfortable rest pose, and presses one key to anchor — this establishes a "logical home" pairing between hand space and arm space. From that moment on, relative hand motion maps directly to end-effector motion for each arm. The thumb-to-index pinch distance continuously controls the gripper. Two USB webcams record the scene while joint states and commanded actions are logged — all synchronized — into a LeRobot-format dataset ready for imitation-learning training.

How we built it

The system is a pipeline of small, independently testable stages:

  1. Hand streaming. The open-source hand-tracking-streamer app on the Quest broadcasts wrist pose and 21 hand landmarks over UDP at ~40 Hz.
  2. Listener and parser. A Python receiver binds 0.0.0.0:9000, drains the socket to stay on the latest frame, and emits a typed HandFrame per packet.
  3. Calibration engine. Our Logical Home Pairing module captures the wrist pose at anchor time and applies a scaled relative mapping: \( \vec{p}_{\text{arm}} = \vec{p}_{\text{home}} + s \cdot (\vec{p}_{\text{hand}} - \vec{p}_{\text{anchor}}) \) with scale \(s = 0.5\), so 10 cm of hand motion maps to 5 cm of arm motion — comfortable for long sessions. The target is then clipped to a ±30 cm cube of safe workspace around the arm's home pose.
  4. Inverse kinematics. ikpy solves for the five revolute joints of the SO-101 from its URDF. We re-use the previous frame's solution as the IK seed for stability, and do a forward-kinematics round-trip to report the achieved position error in real time.
  5. Gripper mapping. The pinch distance between thumb tip (landmark 4) and index tip (landmark 8) is linearly mapped: $$ g = \operatorname{clip}\!\left(\frac{d_{\text{pinch}} - 1.5\,\text{cm}}{8.0\,\text{cm} - 1.5\,\text{cm}},\, 0,\, 1\right) $$
  6. Dataset recorder. Webcam frames and joint states are written on a common 30 Hz clock to a LeRobot-compatible directory structure.
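
As a concrete sketch, stages 3 and 5 boil down to a few lines of plain Python. The function names below are ours for illustration (not the project's actual API); the scale, clamp, and pinch constants mirror the numbers above:

```python
def map_hand_to_arm(hand_pos, anchor_pos, home_pos, scale=0.5, limit=0.30):
    """Scaled relative mapping with a +/-30 cm workspace clamp (meters)."""
    target = []
    for h, a, hm in zip(hand_pos, anchor_pos, home_pos):
        delta = scale * (h - a)                 # 10 cm of hand motion -> 5 cm of arm motion
        delta = max(-limit, min(limit, delta))  # clip to the safe cube around home
        target.append(hm + delta)
    return target

def pinch_to_gripper(d_pinch_cm, d_min=1.5, d_max=8.0):
    """Linear pinch-to-gripper mapping, clipped to [0, 1]."""
    g = (d_pinch_cm - d_min) / (d_max - d_min)
    return max(0.0, min(1.0, g))
```

Clamping the delta rather than the absolute target is equivalent to clipping the target to a cube around the home pose, and keeps the math per-axis and branch-free.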

Each stage has a standalone diagnostic (raw_probe.py, test_loopback.py, ik_sanity_check.py) so we could isolate failures without running the full pipeline.
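
The drain-to-latest pattern from stage 2 looks roughly like this — a minimal sketch, not the project's actual receiver (the parsing of each datagram into a typed `HandFrame` is omitted):

```python
import socket

def recv_latest(sock, bufsize=2048):
    """Drain the UDP socket and return only the newest datagram (or None).

    Older packets are discarded so the control loop always acts on the
    freshest hand pose instead of working through a stale backlog.
    """
    latest = None
    sock.setblocking(False)
    while True:
        try:
            data, _addr = sock.recvfrom(bufsize)
            latest = data
        except BlockingIOError:
            return latest

# Receiver setup as described above: listen on all interfaces, port 9000.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 9000))
```

Because UDP has no backpressure, a blocking `recvfrom` in a slow loop would fall ever further behind the headset; draining to the latest frame trades dropped packets (harmless here) for low latency.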

Challenges we ran into

  • UDP silence with no error. The most time-consuming bug of the hackathon: the Quest streamer would occasionally stop sending packets the moment the headset was removed, without any error on either side. We ended up writing raw_probe.py — a listener that prints every incoming datagram and a heartbeat every second — to distinguish between a broken listener, a blocked firewall, client-isolation on the Wi-Fi AP, and a sleeping headset.
  • macOS application firewall per-binary allow. Each Python venv has a different resolved binary path, and macOS firewall rules are keyed to the resolved path. Rebuilding the venv silently broke ingress until we re-allowed the new binary.
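
The core of that probe is simple enough to sketch here (our reconstruction, not the actual raw_probe.py): print every datagram, and print a heartbeat on every one-second receive timeout so silence is distinguishable from a dead listener.

```python
import socket
import time

def probe(port=9000, duration=5.0):
    """Print every incoming datagram, plus a heartbeat each second.

    Packets printing  -> listener and network path are fine; blame the sender.
    Heartbeats only   -> the listener is alive but nothing is reaching it
                         (firewall, AP client isolation, or a sleeping headset).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    sock.settimeout(1.0)
    deadline = time.monotonic() + duration
    count = 0
    while time.monotonic() < deadline:
        try:
            data, addr = sock.recvfrom(2048)
            count += 1
            print(f"[{count}] {len(data)} bytes from {addr}")
        except socket.timeout:
            print(f"-- heartbeat: listener alive, {count} packets so far")
    sock.close()
    return count
```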

Accomplishments that we're proud of

  • End-to-end bimanual VR teleoperation plus LeRobot-format dataset capture, working live in a weekend.
  • A disciplined set of diagnostic scripts that let us debug the network, listener, and IK stages independently — which turned into the project's biggest force multiplier when we were time-crunched at 2 AM.
  • Safety-first design even without the descoped hardware layer: workspace clamping, real-time joint-limit flagging, a 500 ms UDP watchdog, and a single-keystroke kill switch.
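
The watchdog piece can be sketched in a few lines (hypothetical class name; per the write-up, the real system halts commanded motion when it trips — here we only expose the stale/fresh state):

```python
import time

class UdpWatchdog:
    """Flag the hand-tracking stream as stale when no packet arrives in time."""

    def __init__(self, timeout_s=0.5):
        self.timeout_s = timeout_s
        self.last_seen = time.monotonic()

    def feed(self):
        """Call on every received hand frame."""
        self.last_seen = time.monotonic()

    def is_stale(self):
        """True once the stream has been silent longer than the timeout."""
        return (time.monotonic() - self.last_seen) > self.timeout_s
```

Using `time.monotonic()` rather than wall-clock time keeps the watchdog immune to NTP adjustments mid-session.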

What we learned

  • UDP debugging on consumer Wi-Fi is about 80% network and 20% code — build the network probes first.
  • macOS's firewall caches Python binaries by resolved symlink path, which matters more than most tutorials admit.
  • ikpy handles URDF joint limits well but benefits hugely from seeding with the previous solution; cold-start IK produces visible jitter.
  • A good "logical home" formulation is worth more than a fancy coordinate-frame transform — scaling and anchoring in one place makes the whole system intuitive to operate.
  • When something is unclear, write the smallest possible isolating test (test_loopback.py, raw_probe.py) before touching the main loop.

What's next for Cadence Labs

  • Prompt-driven automation: policies trained on our recorded datasets executing tasks from natural-language instructions.
  • Expansion to more industries and task domains.
  • Collecting more training data to grow the imitation-learning corpus.
