Inspiration

We were inspired by the gap between digital intelligence and physical labor. While AI can explain a circuit diagram, it can't hand you the soldering iron when your hands are tied, so we built a mobile, autonomous robot designed to bridge that gap in a workshop environment.

Technology Stack

  1. Robot chassis - Rover with L298N motor driver, controlled via Raspberry Pi GPIO
  2. Claw arm - Arduino-controlled servos with inverse kinematics, ultrasonic sensor
  3. Camera - External webcam attached to a Mac for overhead vision, plus an on-board Pi webcam for arm guidance
  4. ArUco markers - 4x4_50 dictionary, IDs 0-3 for grid corners, ID 4 on the robot
  5. Raspberry Pi 5 - Voice client, rover control, arm control
  6. Whisper ASR, LLM (Gemini), TTS (ElevenLabs) - Conversation mode
  7. YOLO + ArUco - Vision and path routing

Languages

Python, C++

Frameworks and libraries

  1. FastAPI - Rover control API on the Pi
  2. OpenCV - ArUco marker detection, homography transforms, image processing, camera capture
  3. Ultralytics (YOLO) - YOLO11n object detection
  4. faster-whisper - Whisper ASR backend
  5. ElevenLabs REST API - Text-to-speech
  6. google-genai - Gemini 2.5 Flash for scene interpretation and chat

Tools

Cursor, Fusion 360, 3D printing, soldering, circuit prototyping

Product Summary

An AI agent loop controls the rover and claw to bring tools to you while your hands are occupied. DeskClaw has a mic and speaker, so you can talk to it directly to get help on projects or bounce ideas around. Our goal with this project is to give AI a physical body so it can have real-world impact. We originally wanted OpenClaw to control the arm, but ran into issues with the Raspberry Pi's limited compute and limited library support.
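The agent loop above can be sketched as a simple sense-plan-act cycle. Everything here (the `Pose` type, the tick size, the grasp hand-off) is a hypothetical simplification; the real system closes the loop through the overhead camera and the rover's control API.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    x: float  # grid coordinates, e.g. cm
    y: float

def step_toward(pose: Pose, target: Pose, speed: float = 5.0) -> Pose:
    """One control tick: move up to `speed` units toward the target."""
    dx, dy = target.x - pose.x, target.y - pose.y
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= speed:
        return Pose(target.x, target.y)  # snap when within one tick
    return Pose(pose.x + speed * dx / dist, pose.y + speed * dy / dist)

def fetch_tool(pose: Pose, tool: Pose, max_ticks: int = 100):
    """Agent loop sketch: drive to the tool, then hand off to the claw."""
    for _ in range(max_ticks):
        if pose.x == tool.x and pose.y == tool.y:
            return pose, "grasp"  # arm controller takes over here
        pose = step_toward(pose, tool)
    return pose, "timeout"
```

In the real system each tick would re-read the robot's pose from the overhead camera and issue a drive command, rather than updating a simulated pose.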

AI Use

Yes, over 70% of our code was generated by AI.
