About the Project

What Sparked This

One of our teammates used to train for competitive dance out of a shared bedroom. No mirror big enough, no coach, no way to know if her posture was actually improving or if she was just getting more confident about doing it wrong.

We kept coming back to that. The people who need technical feedback the most are exactly the ones who can't afford it. A $250/hour coach. A studio slot that fills up in minutes. These aren't small obstacles; they're the whole reason most people quietly give up.

We wanted to build something that actually closes that gap. Not a gimmick, not a prototype that works once on a good laptop. Something you could genuinely open on your phone in a cramped apartment and get real, specific, useful feedback from.

How We Actually Built It

We started with the skeleton tracking. MediaPipe BlazePose runs entirely in the browser via WebGL: no server, no upload, no waiting. That felt important to us. Your footage shouldn't have to leave your device for you to get feedback on it.

For any joint we track (say, the knee, sitting between hip A, knee B, and ankle C), we compute the angle using basic vector geometry:

$$\vec{u} = A - B, \quad \vec{v} = C - B$$

$$\theta = \arccos\left(\frac{\vec{u} \cdot \vec{v}}{|\vec{u}| \cdot |\vec{v}|}\right)$$

That gives us a real anatomical angle on every frame. Stack those across a full performance and you have something meaningful to analyze.
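
As a minimal sketch of that calculation, assuming 2D landmarks: the `Point` shape and the `jointAngle` name are ours for illustration, not taken from the project's code. The result is converted to degrees, and the cosine is clamped so rounding error can't push it outside the domain of `acos`.

```typescript
// Hypothetical landmark shape: 2D coordinates in any consistent unit.
interface Point {
  x: number;
  y: number;
}

// Angle at vertex b, in degrees, between the rays b->a and b->c.
function jointAngle(a: Point, b: Point, c: Point): number {
  // u = A - B and v = C - B, matching the formula above.
  const ux = a.x - b.x;
  const uy = a.y - b.y;
  const vx = c.x - b.x;
  const vy = c.y - b.y;

  const norm = Math.hypot(ux, uy) * Math.hypot(vx, vy);
  if (norm === 0) return NaN; // coincident landmarks: the angle is undefined

  // Clamp to [-1, 1] so floating-point drift cannot break acos.
  const cosTheta = Math.min(1, Math.max(-1, (ux * vx + uy * vy) / norm));
  return (Math.acos(cosTheta) * 180) / Math.PI;
}
```

A fully extended leg (hip, knee, and ankle in a line) comes out at 180 degrees, and a right-angle bend at 90. If the landmarks are normalized separately per axis, scaling them by the frame's width and height first keeps the aspect ratio from distorting the angle.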

The vocal side uses the Web Audio API to pull frequency data in real time. We measure how far off each note lands from its target in cents (100 cents to a semitone) using:

$$\text{deviation} = 1200 \times \log_2\left(\frac{f_{\text{actual}}}{f_{\text{target}}}\right)$$

Then we average those deviations across the whole session to generate the final vocal score rather than just reporting the last few seconds, which would be misleading.
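
A rough sketch of the deviation and the session-wide average follows. The `PitchSample` shape and the function names are hypothetical. One assumption to flag: the sketch averages absolute deviations, so that sharp and flat notes don't cancel each other out. Whether the project averages signed or absolute values isn't stated above.

```typescript
// Signed pitch error in cents: positive is sharp, negative is flat.
function centsDeviation(actualHz: number, targetHz: number): number {
  return 1200 * Math.log2(actualHz / targetHz);
}

// Hypothetical sample shape: one detected pitch paired with its target.
interface PitchSample {
  actualHz: number;
  targetHz: number;
}

// Mean absolute deviation, in cents, across every sample in the session.
// Assumption: absolute values are used so opposite errors don't cancel.
function meanAbsCents(samples: PitchSample[]): number {
  if (samples.length === 0) return NaN; // nothing recorded, no score
  let total = 0;
  for (const s of samples) {
    total += Math.abs(centsDeviation(s.actualHz, s.targetHz));
  }
  return total / samples.length;
}
```

For a sense of scale, singing 880 Hz against a 440 Hz target is a full octave off, which is 1200 cents.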

Once all that raw telemetry is compiled (joint coordinates, pitch drift, timing), we ship it to Gemini and it comes back as a real written critique. Not a score out of ten. Actual sentences explaining what went wrong and where.

The Parts That Broke Us

1. The Asynchronous Frame Race

The hardest bug we hit was invisible for a long time. When scanning a prerecorded video, the video element seeks frame by frame while BlazePose processes each one asynchronously. The problem is they don't wait for each other.

The seek loop would advance to frame 12 before the model finished analyzing frame 8, and we'd get empty landmark arrays, which the scoring logic would interpret as a body completely out of frame. Scores were tanking for no visible reason.

The Fix: We made the seek loop block on a promise that only resolves when the model fires its callback. So frame k+1 only gets loaded once inference on frame k is confirmed complete. Simple idea, took embarrassingly long to land on it.
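
The shape of that fix, sketched with stand-ins: `scanFrames`, `seekTo`, `infer`, and the `Landmarks` type are all hypothetical names, and the seek and inference steps are passed in as functions so the loop itself stays independent of any particular model API. The point is only that each iteration awaits inference before the next seek begins.

```typescript
// Hypothetical result type: one [x, y, ...] entry per landmark.
type Landmarks = number[][];

// Process frames strictly in order: seek, then wait for inference, then move on.
async function scanFrames(
  frameTimes: number[],
  seekTo: (t: number) => Promise<void>, // resolves once the video shows time t
  infer: () => Promise<Landmarks>, // resolves once the model reports results
): Promise<Landmarks[]> {
  const results: Landmarks[] = [];
  for (const t of frameTimes) {
    await seekTo(t); // the frame at t is now on screen
    results.push(await infer()); // block until this frame's inference is done
  }
  return results;
}
```

In a browser, `seekTo` could set `video.currentTime` and resolve on the video element's `seeked` event, while `infer` would wrap the model's result callback in a promise. Both wrappers are assumptions about how the pieces might be wired together, not a description of the project's code.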

2. The Video Scaling Nightmare

Webcam footage comes in at a predictable aspect ratio. Uploaded videos don't. Someone records a vertical clip on their phone, we stretch it to fill the container with cover, and suddenly the feet, the most important part of a dance critique, are cropped out entirely.

We added a conditional contain mode for uploaded files that letterboxes the video and keeps the canvas overlay synchronized with it.
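
Keeping the overlay in sync comes down to knowing where the letterboxed video actually sits inside its container. A small sketch of that geometry, assuming the canvas covers the whole container and landmarks are normalized to 0..1 on each axis of the video frame (`Rect`, `containRect`, and `toCanvas` are illustrative names):

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Rectangle the video occupies inside a box under "contain" scaling:
// scaled uniformly to fit, then centered, leaving bars on two sides.
function containRect(videoW: number, videoH: number, boxW: number, boxH: number): Rect {
  const scale = Math.min(boxW / videoW, boxH / videoH);
  const width = videoW * scale;
  const height = videoH * scale;
  return { x: (boxW - width) / 2, y: (boxH - height) / 2, width, height };
}

// Map a normalized landmark (0..1 within the video frame) to canvas pixels.
function toCanvas(nx: number, ny: number, rect: Rect): { x: number; y: number } {
  return { x: rect.x + nx * rect.width, y: rect.y + ny * rect.height };
}
```

A 1080x1920 vertical clip in a 1280x720 container ends up 405 pixels wide and 720 tall, offset 437.5 pixels from the left edge, so a landmark at the bottom center of the frame lands at (640, 720) on the canvas.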

What We Took Away

  • Client-Side Power: Running a dense pose estimation model client-side in a browser is genuinely practical now. Two years ago this felt like a stretch goal. Today it works on a mid-range laptop without breaking a sweat.
  • Data vs. UX: Raw joint-angle numbers mean nothing to a dancer. But the same data rendered as glowing skeleton lines directly on their own footage: that lands. Presentation is part of the product.
  • Intentional Community Design: Downvote buttons in creative spaces don't produce useful feedback; they just suppress people who are still learning. We built the voting system to prevent that specifically: single vote, swappable direction, consensus-weighted rather than raw count.

Mostly though, we just wanted to build something that would have actually helped. We think it does.
