Post-Touch: Project Story

Inspiration

Touch was never the goal. Touch was just the closest proxy we had to the hand: a single point of contact that throws away orientation, speed, shape, and intention.
The question that started this project: what if a canvas could see your whole hand instead of just where it lands?
How I Built It

The stack is deliberately minimal: Next.js, HTML5 Canvas, MediaPipe Tasks Vision, and the Claude API. No game engine. No 3D library. Just a webcam, a 2D canvas, and math.
Hand tracking runs at 30fps using MediaPipe's HandLandmarker in VIDEO mode, detecting 21 landmarks per hand. Every gesture maps to a specific landmark configuration.
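A minimal sketch of how that loop could be wired up with @mediapipe/tasks-vision; the wasm and model paths, the single-hand limit, and the throttling down to 30fps are assumptions, not the project's exact code:

```typescript
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

// Paths and numHands are placeholders; the real project may load assets differently.
async function createTracker(): Promise<HandLandmarker> {
  const vision = await FilesetResolver.forVisionTasks("/mediapipe/wasm");
  return HandLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "/models/hand_landmarker.task" },
    runningMode: "VIDEO",
    numHands: 1,
  });
}

function startTracking(landmarker: HandLandmarker, video: HTMLVideoElement) {
  const tick = () => {
    // detectForVideo expects a monotonically increasing timestamp in milliseconds.
    const result = landmarker.detectForVideo(video, performance.now());
    if (result.landmarks.length > 0) {
      const hand = result.landmarks[0]; // 21 normalized {x, y, z} landmarks
      // ...gesture detection reads from `hand` here...
    }
    requestAnimationFrame(tick); // throttling to ~30fps omitted for brevity
  };
  requestAnimationFrame(tick);
}
```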
Gesture detection is purely geometric, no ML classifier on top. Pinch is detected when the Euclidean distance between thumb tip and index tip falls below a threshold:
$$d(\text{thumb}, \text{index}) = \sqrt{(x_4 - x_8)^2 + (y_4 - y_8)^2} < 0.07$$
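That check translates almost directly into code; landmarks 4 and 8 are the thumb and index fingertips in MediaPipe's hand model:

```typescript
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

const PINCH_THRESHOLD = 0.07; // in normalized image coordinates

// Landmark 4 is the thumb tip, landmark 8 is the index fingertip.
function isPinching(hand: NormalizedLandmark[]): boolean {
  const dx = hand[4].x - hand[8].x;
  const dy = hand[4].y - hand[8].y;
  return Math.hypot(dx, dy) < PINCH_THRESHOLD;
}
```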
Brush rotation uses the knuckle vector from landmark 5 (index MCP) to landmark 17 (pinky MCP), the line across the back of the hand that rotates directly with wrist roll:
$$\theta = \arctan\left(\frac{y_{17} - y_5}{x_{17} - x_5}\right)$$
Accumulated delta past $\pm 0.42$ rad (~24°) triggers a brush switch.
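A sketch of that logic, using atan2 rather than a plain arctangent so the sign of the roll is preserved; the wrap-around handling is an added detail the write-up doesn't spell out:

```typescript
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

const SWITCH_THRESHOLD = 0.42; // rad, ~24°

let lastAngle: number | null = null;
let accumulated = 0;

// Landmark 5 is the index MCP and landmark 17 the pinky MCP; the vector
// between them rotates with wrist roll, independent of forward/back tilt.
function updateBrushRotation(hand: NormalizedLandmark[], switchBrush: (dir: 1 | -1) => void) {
  const angle = Math.atan2(hand[17].y - hand[5].y, hand[17].x - hand[5].x);
  if (lastAngle !== null) {
    // Wrap the per-frame delta into (-π, π] so the accumulator doesn't jump.
    let delta = angle - lastAngle;
    if (delta > Math.PI) delta -= 2 * Math.PI;
    if (delta < -Math.PI) delta += 2 * Math.PI;
    accumulated += delta;
    if (Math.abs(accumulated) > SWITCH_THRESHOLD) {
      switchBrush(accumulated > 0 ? 1 : -1);
      accumulated = 0;
    }
  }
  lastAngle = angle;
}
```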
The erase gesture (open palm + swipe) required the most care. An open palm also triggers the "clear canvas" countdown, so I needed to distinguish deliberate wipes from an accidental still palm. The solution is an exponential moving average of palm speed with heavy inertia:
$$v_t = 0.80 \cdot v_{t-1} + 0.20 \cdot |\Delta p_t|$$
Only when $v_t > 22$ canvas px/frame does the eraser engage. Brief flicks can't spike past it.
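A sketch of that filter; the palm point is assumed to have already been converted to canvas pixels:

```typescript
const ALPHA = 0.2;       // weight of the new sample
const ERASE_SPEED = 22;  // canvas px per frame

let smoothedSpeed = 0;
let lastPalm: { x: number; y: number } | null = null;

function shouldErase(palm: { x: number; y: number }): boolean {
  const frameSpeed = lastPalm ? Math.hypot(palm.x - lastPalm.x, palm.y - lastPalm.y) : 0;
  lastPalm = palm;
  // Heavy inertia: 80% of the old estimate survives each frame, so a single
  // fast frame cannot push the average over the threshold.
  smoothedSpeed = (1 - ALPHA) * smoothedSpeed + ALPHA * frameSpeed;
  return smoothedSpeed > ERASE_SPEED;
}
```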
The eight brushes each have their own renderer. Glow uses radial gradients with shadow blur. Blossom uses Bézier-curve petals with a seeded RNG so each flower looks identical on every animation frame. Electric uses recursive midpoint displacement to generate fractal lightning. Watercolor uses six stacked semi-transparent ellipses per step with hue drift and edge granulation dots.
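As one example, the midpoint-displacement idea behind the Electric brush could look roughly like this; the recursion depth, jitter, and stroke styling are illustrative values, not the project's:

```typescript
type Point = { x: number; y: number };

// Recursively displace the midpoint of each segment to turn a straight
// stroke into a jagged lightning path; the jitter halves at each level.
function lightningPath(a: Point, b: Point, jitter: number, depth: number): Point[] {
  if (depth === 0) return [a, b];
  const mid = {
    x: (a.x + b.x) / 2 + (Math.random() - 0.5) * jitter,
    y: (a.y + b.y) / 2 + (Math.random() - 0.5) * jitter,
  };
  return [
    ...lightningPath(a, mid, jitter / 2, depth - 1),
    ...lightningPath(mid, b, jitter / 2, depth - 1).slice(1), // drop duplicated midpoint
  ];
}

function drawLightning(ctx: CanvasRenderingContext2D, from: Point, to: Point) {
  const pts = lightningPath(from, to, 40, 5);
  ctx.beginPath();
  ctx.moveTo(pts[0].x, pts[0].y);
  for (const p of pts.slice(1)) ctx.lineTo(p.x, p.y);
  ctx.stroke();
}
```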
Color selection is gesturally native. Pinch to lift the brush, then slide your hand over the palette strip that appears at the top of the canvas. The leftmost zone resets to natural ink; the rest maps $x$-position linearly to HSL hue across $[0°, 360°]$.
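A sketch of that mapping; the width of the reset zone and the saturation/lightness values are assumptions:

```typescript
const RESET_ZONE = 0.12; // leftmost fraction of the strip returns to natural ink (assumed width)

// `x` is the hand's horizontal position normalized to [0, 1] across the palette strip.
function pickColor(x: number): string {
  if (x < RESET_ZONE) return "ink";
  const hue = ((x - RESET_ZONE) / (1 - RESET_ZONE)) * 360; // linear map to [0°, 360°]
  return `hsl(${hue.toFixed(0)}, 85%, 60%)`;
}
```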
Claude receives the canvas as a base64 PNG and returns a poetic title and a two-sentence interpretation, treating the painting as a found object rather than evaluating it.
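A sketch of that request using the @anthropic-ai/sdk messages API; the prompt wording and token limit are placeholders, and the model id is copied from the Built With list below:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// `pngBase64` is the canvas exported via toDataURL("image/png") with the data-URL prefix stripped.
async function interpretPainting(pngBase64: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: "claude-sonnet-4-6", // taken from the project's Built With list
    max_tokens: 300,
    messages: [
      {
        role: "user",
        content: [
          { type: "image", source: { type: "base64", media_type: "image/png", data: pngBase64 } },
          {
            type: "text",
            text: "Treat this painting as a found object. Give it a poetic title and a two-sentence interpretation.",
          },
        ],
      },
    ],
  });
  return response.content[0].type === "text" ? response.content[0].text : "";
}
```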
Challenges

The erase/clear collision was the hardest design problem. Both use an open palm. The key insight was that "hold still" and "wipe" have very different velocity signatures over time, and an exponential filter with high inertia makes that difference robust.
Brush rotation reliability was broken until I switched landmarks. I originally used wrist to middle MCP, which tracks the hand tilting forward and back, not wrist roll. Switching to index MCP to pinky MCP fixed it immediately, because that vector rotates in exactly the plane of wrist roll.
Keeping animation state out of React was a constant discipline. The RAF loop runs at 60fps inside a useEffect with empty deps. Everything it reads — brush index, background, flower state, star positions — lives in refs. Any accidental closure over stale state caused subtle bugs that were hard to trace.
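A stripped-down version of that pattern; the ref names and default values are illustrative, not the project's:

```typescript
import { useEffect, useRef } from "react";

export function useCanvasLoop() {
  const canvasRef = useRef<HTMLCanvasElement | null>(null);
  const brushIndexRef = useRef(0);         // mutated directly by gesture handlers
  const backgroundRef = useRef("#0b0b12"); // hypothetical default background

  useEffect(() => {
    const ctx = canvasRef.current?.getContext("2d");
    if (!ctx) return;
    let raf = 0;
    const loop = () => {
      // Reading refs here always sees the latest values, so the effect can run
      // with empty deps and never closes over stale React state.
      ctx.fillStyle = backgroundRef.current;
      // ...render the active brush for brushIndexRef.current...
      raf = requestAnimationFrame(loop);
    };
    raf = requestAnimationFrame(loop);
    return () => cancelAnimationFrame(raf);
  }, []); // created once for the lifetime of the component

  return { canvasRef, brushIndexRef, backgroundRef };
}
```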
What I Learned

The hardest part of gestural interfaces isn't detection; it's disambiguation. Hands are always doing something. The challenge is deciding what counts as intent.
And that Claude as an interpreter changes what it means to finish a painting. The work isn't done when you put down the brush. It's done when something reads it.
Built With
- claude-api-(claude-sonnet-4-6)
- fonts
- html5-canvas-api
- mediapipe-tasks-vision
- next.js
- tailwind-css
- typescript