SMARTArm: Voice & Vision Controlled Robotic Manipulation

Inspiration:

We asked ourselves a simple question: why are robotic arms still so hard to control? Industrial arms need specialized programming. Consumer arms need clunky apps. What if you could just talk to a robot and it understood, or move your hand and it followed?

At EagleHacks, with sponsors like ElevenLabs, Blue Sparq, and Arthrex in the room (companies building voice AI, IoT hardware, and precision medical devices), we saw the perfect opportunity to bridge the gap between human intent and robotic action. We wanted to build something you could demo in 60 seconds and immediately understand: tell it what to grab, and it grabs it.

The space theme sealed it. If we're sending robots to Mars, they need to understand voice commands, see their environment, and communicate back autonomously. SMARTArm is our prototype for that future.

What It Does:

SMARTArm is a robotic arm system with three control modes:

  • Voice Mode: Say "pick up the red block." The system uses computer vision to locate the block by color, calculates the arm trajectory, and executes a smooth pickup sequence. ElevenLabs AI voice narrates every step back to you.
  • Hand Tracking Mode: A webcam tracks your hand using MediaPipe. Move your hand and the arm mirrors you in real time. Pinch your fingers and the gripper closes. Open your hand and it releases.
  • Stacking Mode: The arm autonomously picks up blocks and stacks them on top of each other, raising the placement height with each block.
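The pinch gesture in Hand Tracking Mode can be sketched as a distance test between two of MediaPipe's 21 hand landmarks. The snippet below is an illustration under stated assumptions, not the project's code: it assumes the landmarks arrive as normalized (x, y) pairs, compares the thumb tip (index 4) with the index fingertip (index 8), and divides by the wrist-to-middle-knuckle distance (indices 0 and 9) so the result does not depend on how far the hand is from the camera. The function names, the 0.35 threshold, and the linear hand-to-servo mapping are placeholders to tune.

```python
import math

# MediaPipe hand landmark indices used here.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP = 0, 4, 8, 9


def is_pinching(landmarks, threshold=0.35):
    """Return True when thumb and index tips are close relative to hand size.

    landmarks: sequence of 21 (x, y) pairs in normalized image coordinates.
    threshold: assumed ratio; tune it against real footage.
    """
    pinch = math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    hand_size = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if hand_size == 0:
        return False
    return pinch / hand_size < threshold


def hand_x_to_servo(x_norm, lo=500, hi=2500):
    """Map a normalized x in [0, 1] linearly onto the servo range, clamped."""
    x = min(max(x_norm, 0.0), 1.0)
    return round(lo + x * (hi - lo))
```

Normalizing by hand size is one way to keep the gripper from chattering as the hand moves toward or away from the webcam; a small hysteresis band around the threshold would help further.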

Throughout all modes, ElevenLabs voice AI acts as the robot's voice — confirming commands, describing what it sees, and reporting what it's doing.
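Before the arm can act on a phrase such as "pick up the red block", the transcribed text has to be reduced to an action and a target color. The following is a minimal sketch of that step, assuming simple keyword matching; the phrase table, the color list, and the name parse_command are hypothetical and not taken from the project.

```python
import re

# Assumed vocabulary; extend to match the blocks and verbs actually used.
KNOWN_COLORS = ("red", "green", "blue", "yellow")
ACTIONS = {
    "pick up": "pick",
    "grab": "pick",
    "stack": "stack",
    "release": "release",
}


def parse_command(text):
    """Return (action, color) from a transcribed phrase; either may be None."""
    lowered = text.lower()
    action = next(
        (name for phrase, name in ACTIONS.items() if phrase in lowered), None
    )
    color = next(
        (c for c in KNOWN_COLORS if re.search(rf"\b{c}\b", lowered)), None
    )
    return action, color
```

Returning None for an unrecognized action or color gives the voice layer a clear cue to ask the user to repeat the command instead of moving the arm.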

How We Built It:

Hardware:

  • LewanSoul LeArm 6DOF robotic arm with LDX-218 servos and LSC-6 controller
  • ESP32 DevKit flashed as a USB-to-TTL serial bridge (9600 baud)
  • External USB webcam positioned to view the workspace
  • Mac laptop running all Python code

The Communication Chain:

Mac (Python) → USB → ESP32 (bridge) → TTL Serial → LSC-6 Controller → 6 Servos

The ESP32 runs a simple transparent bridge: every byte Python sends over USB goes straight to the LSC-6 servo controller. This gave us reliable, low-latency wired control without fighting Bluetooth pairing issues.

Software Stack:

  • Python — core language for everything
  • OpenCV — HSV color detection for identifying colored blocks
  • MediaPipe — real-time hand tracking (21 landmarks at 30fps on CPU)
  • ElevenLabs API — text-to-speech voice output using the Flash v2.5 model
  • speech_recognition — Google STT for voice command input
  • Pyserial — serial communication to the arm via ESP32

The Key Challenge: Camera Pixels → Arm Positions

The hardest engineering problem was mapping what the camera sees (pixel coordinates) to where the arm needs to move (servo positions). We built a grid calibration system: we placed blocks at known positions, recorded the exact servo values needed to reach each spot, and interpolated between calibration points at runtime. This approach is inspired by how industrial pick-and-place robots are often set up: they are taught positions through calibration rather than computing inverse kinematics on the fly.
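Interpolating between calibration points can be sketched as bilinear interpolation inside one rectangular grid cell. The write-up does not say which interpolation scheme was used, so treat this as one plausible version: the cell layout, the dictionary keys, and the function name bilinear_servo are assumptions, and the numbers in the usage note are made up.

```python
def bilinear_servo(px, py, cell):
    """Interpolate servo targets for pixel (px, py) inside one calibration cell.

    cell: dict with pixel bounds x0, x1, y0, y1 and the servo tuples recorded
    at the four corners: q00 at (x0, y0), q10 at (x1, y0), q01 at (x0, y1),
    q11 at (x1, y1). All tuples must have the same length.
    """
    tx = (px - cell["x0"]) / (cell["x1"] - cell["x0"])
    ty = (py - cell["y0"]) / (cell["y1"] - cell["y0"])
    # Clamp so a pixel slightly outside the cell does not extrapolate.
    tx = min(max(tx, 0.0), 1.0)
    ty = min(max(ty, 0.0), 1.0)
    result = []
    for a, b, c, d in zip(cell["q00"], cell["q10"], cell["q01"], cell["q11"]):
        top = a + (b - a) * tx        # blend along the y0 edge
        bottom = c + (d - c) * tx     # blend along the y1 edge
        result.append(round(top + (bottom - top) * ty))
    return tuple(result)
```

For example, with corners recorded as (1000, 1200), (2000, 1200), (1000, 1800), and (2000, 1800) over the pixel rectangle x 100 to 300, y 50 to 250, a block seen at the cell center (200, 150) maps to servo values halfway between the corners. Servo position is not linear in pixel space, so a denser grid is needed where the arm's reach changes quickly.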

Serial Protocol:

The LSC-6 uses a binary protocol in which every command starts with 0x55 0x55, followed by a length byte, a command ID, and parameters. Servo positions range from 500 to 2500, with 1500 as center. We wrote a Python controller class that abstracts this into simple calls like arm.move_servo(BASE, 1800, 1500), which moves the base to position 1800 over 1500 milliseconds for smooth, cinematic motion.
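A packet builder for this kind of frame might look like the sketch below. Only the 0x55 0x55 header, the length byte, the command ID, and the 500 to 2500 range come from the description above. The command ID 0x03, the parameter order (servo count, duration as two little-endian bytes, then ID and position per servo), and the rule that the length counts itself plus the command byte are assumptions based on the commonly published LSC-series format, and should be checked against the controller's protocol document. build_move_packet is a hypothetical name.

```python
HEADER = bytes([0x55, 0x55])
CMD_SERVO_MOVE = 0x03          # assumed command ID for a timed move
MIN_POS, MAX_POS = 500, 2500   # position range stated in the write-up


def build_move_packet(moves, duration_ms):
    """Build one frame that moves several servos over the same duration.

    moves: list of (servo_id, position) pairs.
    duration_ms: move time in milliseconds, 0-65535.
    """
    if not moves:
        raise ValueError("at least one servo move is required")
    if not 0 <= duration_ms <= 0xFFFF:
        raise ValueError("duration_ms must fit in two bytes")
    body = bytearray([len(moves), duration_ms & 0xFF, duration_ms >> 8])
    for servo_id, position in moves:
        position = min(max(position, MIN_POS), MAX_POS)  # keep in safe range
        body += bytes([servo_id, position & 0xFF, position >> 8])
    length = len(body) + 2  # assumed: counts the length and command bytes
    return HEADER + bytes([length, CMD_SERVO_MOVE]) + bytes(body)
```

Under those assumptions, and taking the base as servo ID 6 purely for illustration, build_move_packet([(6, 1800)], 1500) yields the ten bytes 55 55 08 03 01 DC 05 06 08 07, which could then be written to the port opened with pyserial at 9600 baud. Putting several servos in one frame is what lets the joints start and finish together.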

Challenges We Faced:

  • Power supply debugging: The arm kept beeping on startup. We learned the hard way that a 9V battery fails on both counts: its voltage exceeds the 8.4V maximum, and it cannot deliver the 3-5A the servos need. We switched to a proper rechargeable battery pack.
  • Serial connection on Mac: The LSC-6's Bluetooth module uses BLE, which doesn't create a standard serial port on Mac. The micro-USB port needed a CH340 driver. We ultimately used the ESP32 as a serial bridge, which ended up being more reliable than either direct connection method.
  • Single-camera depth limitation: Our camera could see X and Y perfectly but couldn't judge depth along its viewing axis. For stacking, this meant the arm would align on one axis but miss on the other. We solved this by using fixed preset positions for placement zones rather than relying purely on vision.
  • Gripper precision: 20mm blocks with a hobby servo gripper require precise positioning. Servo backlash and arm flex meant the gripper wasn't always exactly where we commanded. We added a visual correction step: detect, move close, detect again, fine-adjust, then grasp.
  • Movement smoothness: Fast servo commands made the arm jerk violently. We enforced a minimum 1500ms duration on all movements and used coordinated multi-servo commands so joints move simultaneously, creating fluid motion.
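The detect, move, detect-again correction and the minimum-duration rule can be sketched as a small loop. This is an illustration, not the project's implementation: detect and move_by are hypothetical callables standing in for the vision and arm layers, and the tolerance, step limit, and gain are placeholder values.

```python
MIN_DURATION_MS = 1500  # minimum move time stated in the write-up


def clamp_duration(ms):
    """Never let a move run faster than the minimum duration."""
    return max(ms, MIN_DURATION_MS)


def approach_with_correction(detect, move_by, tolerance_px=5, max_steps=3, gain=0.8):
    """Nudge the arm until the gripper is within tolerance of the block.

    detect(): returns the (ex, ey) pixel error between gripper and block,
        or None when the block is not visible.
    move_by(dx, dy): moves the arm by a pixel-space offset.
    Returns True when the final error is within tolerance_px on both axes.
    """
    for _ in range(max_steps):
        err = detect()
        if err is None:
            return False
        ex, ey = err
        if abs(ex) <= tolerance_px and abs(ey) <= tolerance_px:
            return True
        # Move only part of the way to avoid overshooting on a flexible arm.
        move_by(gain * ex, gain * ey)
    err = detect()
    return err is not None and max(abs(err[0]), abs(err[1])) <= tolerance_px
```

A gain below 1 trades a little speed for stability: with backlash and arm flex, a full-size correction can overshoot and oscillate around the block.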

What We Learned:

  • Calibration beats computation: For a constrained workspace, a lookup table of calibrated positions is faster to build and more reliable than solving inverse kinematics.
  • Hardware debugging is 60% of robotics: More time went into power supply issues, serial connections, and driver installation than writing the actual control code.
  • Voice makes robots accessible: Adding ElevenLabs voice transformed the project from a tech demo into something anyone could interact with. Non-technical people at the hackathon walked up and talked to the arm naturally.
  • Ship the demo, not the theory: We could have spent all 48 hours on perfect camera calibration and IK solvers. Instead, we built something that works reliably and impresses in 90 seconds.

Industry Applications — Where SMARTArm Goes Next:

We didn't build SMARTArm in a vacuum. We built it with real industries in mind — industries represented right here at EagleHacks by the sponsors who inspired this project.

Company Challenges: Arthrex, Blue Sparq, Eightpoint

Arthrex is a global leader in minimally invasive orthopedic devices, headquartered right here in Naples. In the operating room, surgical instrument tracking is a critical patient safety problem — retained instruments cause serious complications and cost hospitals millions. SMARTArm's core architecture — vision-based identification, precise robotic manipulation, and voice-confirmed actions — maps directly to automated surgical tray management. Imagine a system that visually identifies each instrument, tracks what's been used, and confirms returns by voice: "Retractor returned. 11 of 12 instruments accounted for." Arthrex's own hackathon history shows they value exactly this kind of hardware + software integration — their past winning teams built camera-based security systems using Raspberry Pi and Arduino, the same kind of embedded-to-computer pipeline we built with ESP32 and Python. Beyond instrument tracking, voice-controlled robotic assistance could support surgeons during procedures, hands-free — an area where Arthrex's expertise in medical education and minimally invasive technology intersects perfectly with what SMARTArm demonstrates.

Blue Sparq — IoT-Connected Industrial Automation:

Blue Sparq, based in Cape Coral, specializes in IoT-connected machine control systems with web and mobile dashboards — exactly what SMARTArm is at its core. Our system is a connected robotic device controlled via serial protocol, monitored through a live dashboard, and operated remotely through voice and vision. This is the same architecture Blue Sparq builds for commercial clients: embedded controllers talking to cloud dashboards, with human operators interacting through intuitive interfaces. SMARTArm demonstrates that affordable, accessible robotic manipulation doesn't require six-figure industrial arms. A $60 hobby arm, an ESP32, a webcam, and smart software can perform reliable pick-and-place operations. For Blue Sparq's clients in manufacturing and commercial equipment, this approach could bring automated sorting, quality inspection, and assembly assistance to small operations that can't afford traditional industrial robotics.

Eightpoint — User-First Product Design & AI-Driven Experiences:

Eightpoint builds consumer products trusted by millions — from weather radar apps to the Wave Browser's ocean cleanup initiative. What makes their products work is the user experience: intuitive, accessible, and driven by AI personalization. SMARTArm embodies that same philosophy. You don't need to learn a programming language or read a manual to use it. You talk to it. You wave at it. It talks back. The voice layer powered by ElevenLabs transforms a technical robotics demo into something anyone can interact with naturally. Eightpoint's focus on building products that "enrich everyday lives" through thoughtful design and data-driven development aligns with where SMARTArm is headed — making robotic assistance as natural as having a conversation.

What's Next:

  • Stereo vision: We have a dual-lens stereo camera module ready to add true depth perception for 3D workspace mapping.
  • Inverse kinematics: Replace calibrated presets with computed trajectories for arbitrary pick positions.
  • ElevenLabs Conversational AI Agent: Upgrade from simple TTS to a full conversational agent for natural back-and-forth dialogue.
  • Multi-arm coordination: Scale the architecture to control multiple arms collaboratively.
  • Scaling to real-world deployment: Warehouse pick-and-pack for Blue Sparq's IoT clients, surgical instrument management for Arthrex's OR environments, and accessible consumer robotics with Eightpoint's user-first design philosophy.
