Inspiration
Our inspiration came from personal experience. Across the many projects we have built, problems always arise during the hardware assembly phase: a bad solder joint here, a misidentified component there. Ariel was born as our answer to these pain points: a tireless, AI-powered co-pilot for soldering, PCB assembly, 3D-printed part fitting, and everything in between.
What It Does
Ariel is a workbench assistant powered by computer vision technology with gesture and voice recognition. It can follow you around your workspace, hold items for you, record video, take pictures, illuminate hard-to-reach places, and identify components, all hands-free.
How We Built It
Using 3D printed parts, stepper motors, servo motors, and a BLDC motor, we assembled a 3-axis robot arm with 5 degrees of freedom. With encoders and limit switches we implemented full homing and calibration. A Qualcomm Rubik Pi 3 handles high-level AI and computer vision, communicating over serial with an Arduino Mega for low-level motor control. A workspace-mounted Logitech Brio 105 acts as Ariel's eyes, covering the entire bench.
For depth estimation from a monocular camera we used a homographic projection model. Given a known workspace plane, a pixel coordinate $(u, v)$ maps to real-world coordinates via:
$$\begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} \sim H^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$
where $H$ is the homography matrix calibrated from known reference points on the bench surface.
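As a rough illustration of the mapping above, here is a minimal pure-Python sketch that estimates $H^{-1}$ directly from four pixel-to-bench reference pairs and then applies it to a pixel. The function names, the four-point direct solve, and the millimetre units are assumptions made for illustration, not the project's actual calibration code; a real setup would more likely use more points and a library routine.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate reference points")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_pixel_to_bench(pixels, bench):
    """Estimate H^{-1} (pixel -> bench plane) from exactly four point pairs.

    Fixes the bottom-right entry to 1, which removes the scale ambiguity.
    """
    A, b = [], []
    for (u, v), (X, Y) in zip(pixels, bench):
        A.append([u, v, 1, 0, 0, 0, -u * X, -v * X])
        b.append(X)
        A.append([0, 0, 0, u, v, 1, -u * Y, -v * Y])
        b.append(Y)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def pixel_to_bench(H_inv, u, v):
    """Apply H^{-1} to pixel (u, v) and divide out the homogeneous scale."""
    x = H_inv[0][0] * u + H_inv[0][1] * v + H_inv[0][2]
    y = H_inv[1][0] * u + H_inv[1][1] * v + H_inv[1][2]
    w = H_inv[2][0] * u + H_inv[2][1] * v + H_inv[2][2]
    return x / w, y / w


# Hypothetical calibration: four bench markers as seen by the camera (pixels)
# and their measured positions on the bench (millimetres).
pixels = [(100, 80), (540, 90), (600, 400), (60, 420)]
bench = [(0, 0), (400, 0), (400, 300), (0, 300)]
H_inv = fit_pixel_to_bench(pixels, bench)
X, Y = pixel_to_bench(H_inv, 320, 240)  # bench position of an arbitrary pixel
```

The mapping is only valid for points that lie on the calibrated plane; anything raised above the bench surface will be projected to the wrong location.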
Challenges We Ran Into
Computer vision: finding a model that accurately tracks hand landmarks and detects gestures without depth information was non-trivial. Standard models assume known scale; we had to compensate algorithmically.
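One common way to work around unknown scale, shown here only as a sketch and not necessarily what Ariel does, is to normalise gesture distances by an on-hand reference length so the test gives the same answer whether the hand is near or far from the camera. The landmark indices follow the MediaPipe Hands convention (0 = wrist, 4 = thumb tip, 8 = index fingertip, 9 = middle-finger MCP); the `is_pinch` name and the `0.35` threshold are hypothetical.

```python
import math

# MediaPipe Hands landmark indices used below.
WRIST, THUMB_TIP, INDEX_TIP, MIDDLE_MCP = 0, 4, 8, 9


def is_pinch(landmarks, threshold=0.35):
    """Return True when thumb and index tips are close relative to hand size.

    landmarks: sequence of 21 (x, y) image coordinates.
    threshold: hypothetical ratio; it would need tuning on real footage.
    """
    # Wrist-to-knuckle length acts as a per-frame scale reference.
    hand_size = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if hand_size == 0:
        return False
    gap = math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    return gap / hand_size < threshold
```

Because both distances shrink or grow together as the hand moves toward or away from the camera, the ratio stays roughly constant without any depth measurement.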
Motor control & calibration: balancing simultaneous calibration and motion across multiple axes over a serial bridge between the Rubik Pi and Arduino introduced race conditions and timing issues that took significant debugging to resolve.
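The write-up does not say how these timing issues were resolved. One widely used pattern for this kind of host-to-microcontroller link is stop-and-wait framing: each command carries a sequence number and checksum, and the host sends the next command only after the matching acknowledgement arrives. The sketch below illustrates that idea; the frame layout, the `0x7E` start byte, and all names are hypothetical, and the `write` and `read_ack` callables stand in for whatever serial I/O the real system uses.

```python
START = 0x7E  # hypothetical start-of-frame marker


def checksum(data: bytes) -> int:
    """XOR of all bytes; cheap to compute on a small microcontroller."""
    value = 0
    for byte in data:
        value ^= byte
    return value


def encode_frame(seq: int, payload: bytes) -> bytes:
    """Layout: START | seq | length | payload | checksum(seq..payload)."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    body = bytes([seq & 0xFF, len(payload)]) + payload
    return bytes([START]) + body + bytes([checksum(body)])


def decode_frame(frame: bytes):
    """Return (seq, payload), or None if the frame is malformed or corrupt."""
    if len(frame) < 4 or frame[0] != START:
        return None
    body, received = frame[1:-1], frame[-1]
    if checksum(body) != received or body[1] != len(body) - 2:
        return None
    return body[0], body[2:]


class StopAndWaitSender:
    """Sends one command at a time and waits for its ack before the next."""

    def __init__(self, write, read_ack):
        self.write = write        # callable taking the frame bytes
        self.read_ack = read_ack  # callable returning an acked seq or None
        self.seq = 0

    def send(self, payload: bytes, retries: int = 3) -> bool:
        frame = encode_frame(self.seq, payload)
        for _ in range(retries):
            self.write(frame)
            if self.read_ack() == self.seq:
                self.seq = (self.seq + 1) & 0xFF
                return True
        return False  # caller decides whether to re-home or abort
```

Serialising commands this way trades some throughput for determinism: a calibration command cannot be overtaken by a motion command that was queued after it.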
Accomplishments We're Proud Of
- Workspace mapping and depth estimation from a single monocular camera, using a flat-plane assumption.
- Full hardware integration: 3D printed structure, multi-motor kinematics, and end-effector (camera, light, fan) all communicating as one cohesive system.
- Unifying three AI pipelines (Google Gemini, ElevenLabs TTS, and OpenCV/MediaPipe) into a single interactive, real-time environment.
What We Learned
- How to map a complex idea end-to-end from hardware to software under tight hackathon constraints.
- Deep practical knowledge of computer vision, edge AI, and the capabilities (and limits) of new hardware like the Rubik Pi.
- How to manage energy, prioritize ruthlessly, and keep shipping when the clock is running.
What's Next for Ariel
The hackathon build was a proof of concept. Here is where Ariel goes next:
- Tool recognition & task guidance: using the workspace camera to identify components, flag polarity errors, and provide step-by-step overlays for assembly procedures.
- Force feedback & precision grasping: adding a proper end-effector gripper with tactile sensing so Ariel can pick up and position components autonomously.
- True depth perception: integrating a stereo camera or ToF sensor to replace the homography approximation and unlock $\mathbb{R}^3$ workspace awareness.
- Custom fine-tuned vision model: training on a dataset of bench components (resistors, ICs, connectors) for faster, more reliable identification than a general-purpose model.
- Cloud sync & session memory: logging every build session so Ariel can reference past work, catch repeated mistakes, and build a personal knowledge base over time.