Programmers are known to talk to rubber ducks as a means of reasoning about their code. Our AI-powered mechanical rubber duck not only hearkens to your predicaments, but also autonomously fixes your code by pecking at your keyboard with its massive bill.
We were interested in the idea of stress-testing LLMs in the physical world. The rubber duck, whose bill is too wide to hit one key at a time, has to constantly correct what it has typed (while making more errors to be corrected) and adjust its configuration for path planning to optimize for upcoming keystrokes. We gave the duck a grumpy temper, so it quacks and complains while begrudgingly working on its Sisyphean task.
We're essentially yanking one of the links in the familiar "vibe coding" workflow into the physical world, bringing AI out of its digital "comfort zone". While a quacking, pecking mechanical duck might be a somewhat wacky tool for the job, we think it is a good metaphor for a deeper question. Humans adapt well to ever-changing surroundings: we subconsciously adjust, re-plan, and error-correct our motions on the fly in response to real-time perception. Can we build an AI to do the same?
For the rubber duck’s neck, we designed a four-bar linkage, which gives us two degrees of freedom: the duck can extend and retract its head as well as peck downwards, all powered by a pair of servo motors inside the duck’s body. This elegant design saves us the trouble of building a separate mechanism for each degree of freedom, but significantly more math is involved in solving for the motors’ output angles from the desired position of the duck’s head, a notorious problem known as inverse kinematics (IK). The first obstacle is that each servo only has a 180-degree sweep, and if the ranges of the two servos overlap perfectly, we actually lose a lot of reach. We therefore wrote a web-based simulation and used it to find the overlap between the two ranges that maximizes the reach we need to cover the entire keyboard. The second problem is solving the IK itself. While solving for the tip of the four-bar linkage is trivial, solving for the tip of the duck’s bill (a fixed extension on one of the four bars) turned out to be unexpectedly gnarlier. We enlisted a little help from Claude Code (as well as the domain knowledge of Leo McElroy, our friend and master of constraint solvers) to come up with a compact solution using circle intersections. We then improved the solver to respect chirality, so that the linkage cannot “flip backwards” or collapse. In the end, we have a dexterous duck neck that can deliver the bill to any desired position within range, with all the solver code fitting on the microcontroller.
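To give a flavor of the circle-intersection idea, here is a minimal sketch in JavaScript. It is not our actual solver: it handles a simplified two-link chain rather than the full four-bar linkage with the bill offset, and the names `circleIntersect`, `solveTwoLink`, and the `side` parameter are made up for this illustration. The joint between the two links must lie on a circle around the base and on a circle around the target, and the `side` sign picks one of the two intersection points, which is what keeps the chain from flipping to its mirror-image pose.

```javascript
// Intersect two circles with centers c0, c1 and radii r0, r1.
// side = +1 returns the intersection to the left of the direction c0 -> c1,
// side = -1 the one to the right. Returns null if the circles do not meet.
function circleIntersect(c0, r0, c1, r1, side) {
  const dx = c1.x - c0.x;
  const dy = c1.y - c0.y;
  const d = Math.hypot(dx, dy);
  if (d === 0 || d > r0 + r1 || d < Math.abs(r0 - r1)) return null;
  // a: distance from c0 to the chord joining both intersections, along c0 -> c1.
  const a = (r0 * r0 - r1 * r1 + d * d) / (2 * d);
  // h: half the chord length (clamped to avoid NaN from rounding).
  const h = Math.sqrt(Math.max(0, r0 * r0 - a * a));
  const mx = c0.x + (a * dx) / d;
  const my = c0.y + (a * dy) / d;
  return { x: mx - (side * h * dy) / d, y: my + (side * h * dx) / d };
}

// Hypothetical two-link chain anchored at the origin with link lengths l1, l2.
// Returns the base angle and the relative joint angle (radians), or null if
// the target is out of reach. `side` fixes the chirality of the solution.
function solveTwoLink(target, l1, l2, side) {
  const joint = circleIntersect({ x: 0, y: 0 }, l1, target, l2, side);
  if (joint === null) return null;
  const base = Math.atan2(joint.y, joint.x);
  const bend = Math.atan2(target.y - joint.y, target.x - joint.x) - base;
  return { base, bend, joint };
}
```

Calling `solveTwoLink({ x: 6, y: 0 }, 5, 5, 1)` places the joint at (3, 4), while passing `-1` as the last argument gives the mirrored pose with the joint at (3, -4). Always passing the same sign is one simple way to rule out the flipped configuration.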
Another software engineering challenge came from the sheer number of components that need to work together (mechanics, sensors, sound, microphone, text editor, AI integration…) and, as a consequence, from the need to divide the work so that every teammate has something to build that can still be integrated in the end. We designed an HTTP-based inter-process communication (IPC) system in which a central HTTP server acts as the mastermind. Each component is a client program (which may or may not spin up its own server) that polls the central server for “jobs”, or things that need doing, with an HTTP request and, when it’s done, sends back the result via another HTTP request. The central server pulls everything together, figures out which component should do what and when, and brings to life a quacking, quipping, and pecking duck that all of us contributed to.
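The job-polling pattern can be sketched roughly as follows. This is an illustration only, not our server code: the `JobBoard` class, its method names, and the component names are invented for the example, and the HTTP layer is left out. In a real setup each method would presumably sit behind a route on the central server, with `poll` and `complete` being the two requests a client makes.

```javascript
// In-memory stand-in for the central server's bookkeeping (hypothetical names).
class JobBoard {
  constructor() {
    this.nextId = 1;
    this.pending = new Map();  // component name -> queue of waiting jobs
    this.inFlight = new Map(); // job id -> job handed out but not yet finished
    this.results = new Map();  // job id -> result reported by the client
  }

  // The mastermind queues a job for one component and gets its id back.
  post(component, payload) {
    const job = { id: this.nextId++, component, payload };
    if (!this.pending.has(component)) this.pending.set(component, []);
    this.pending.get(component).push(job);
    return job.id;
  }

  // A client asks for work; it gets the oldest waiting job or null.
  poll(component) {
    const queue = this.pending.get(component);
    if (!queue || queue.length === 0) return null;
    const job = queue.shift();
    this.inFlight.set(job.id, job);
    return job;
  }

  // A client reports a result; unknown or already-finished ids are rejected.
  complete(id, result) {
    if (!this.inFlight.has(id)) return false;
    this.inFlight.delete(id);
    this.results.set(id, result);
    return true;
  }
}
```

With this shape, a component such as the neck controller would loop on `poll("neck")`, act on each job it receives, and call `complete` with the outcome, while components with nothing queued simply get `null` back and try again later.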
Built With
- arduino
- claude
- elevenlabs
- esp32
- gemini
- javascript
- mg996r
- pro-micro