Inspiration

A single PT session costs $100–$400, waitlists stretch weeks, and 70% of patients drop off home exercise programs because no one's watching. We asked: what if your webcam could be your physical therapist — one that actually sees you, scores your form in real time, and adapts to your body?

What it does

KinetiCare is an agentic AI physical therapy coach that runs entirely in the browser. You describe your injury and goals. A planning agent builds a personalized exercise protocol. Then you work out — your webcam tracks 33 body landmarks at 30fps, scores form against your own calibrated baseline (not generic targets), and a coaching agent delivers real-time spoken feedback based on joint angles, fatigue, facial expression, and live screenshots. Everything is hands-free via voice. After your session, a report agent generates a PT-grade clinical note with compensation patterns, ROM data, and escalation flags.

How we built it

  • Frontend: Next.js 14, React 18, TypeScript, Tailwind — fully client-side for privacy
  • Pose detection: MediaPipe Pose Landmarker — 33 landmarks on-device at 30fps with body segmentation
  • Calibration: Three reference poses capture your actual joint angles so scoring is relative to your biomechanics
  • Coaching agent: Gemini 2.5 Flash processes joint data, fatigue estimates, facial expression, and a webcam frame every 8s to generate targeted corrections
  • Voice: ElevenLabs TTS with five coach personas (drill sergeant, Gen Z hype beast, British physio, scientist, golden retriever) + speech recognition
  • Planning + Report agents: Intake → sequenced exercise protocol; post-session → structured PT note with escalation flags
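The form-scoring pipeline above starts from joint angles derived from MediaPipe's landmark output. As a minimal sketch (our own helper names, not KinetiCare's actual code), the angle at a joint such as the knee can be computed from three landmarks via the dot product:

```typescript
// Landmark shape mirrors MediaPipe's normalized output (z omitted here
// for a 2D sketch; the real pipeline may use all three coordinates).
interface Landmark { x: number; y: number; }

// Interior angle in degrees at joint B, formed by landmarks A-B-C,
// e.g. the knee angle from hip (A), knee (B), ankle (C).
function jointAngle(a: Landmark, b: Landmark, c: Landmark): number {
  const ab = { x: a.x - b.x, y: a.y - b.y };
  const cb = { x: c.x - b.x, y: c.y - b.y };
  const dot = ab.x * cb.x + ab.y * cb.y;
  const mag = Math.hypot(ab.x, ab.y) * Math.hypot(cb.x, cb.y);
  // Clamp to guard against floating-point drift outside [-1, 1].
  const cos = Math.min(1, Math.max(-1, dot / mag));
  return (Math.acos(cos) * 180) / Math.PI;
}
```

A fully extended leg (hip, knee, ankle collinear) yields roughly 180 degrees; a deep squat yields a much smaller knee angle.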

Challenges we ran into

Pose scoring that works. Generic angle thresholds ignore body proportions. We solved this with per-user calibration — three reference poses, all scoring relative to your baseline. Hardest problem, biggest payoff.
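The idea of calibration-relative scoring can be sketched like this (a simplified illustration with our own names and falloff curve, not the production scoring function): the calibration step records the user's own angle for each reference pose, and live scoring measures deviation from that personal baseline instead of a one-size-fits-all target.

```typescript
interface CalibratedJoint {
  baselineDeg: number;  // the user's own angle, captured during calibration
  toleranceDeg: number; // acceptable deviation before the score decays
}

// Returns a 0-100 form score: full marks inside the tolerance band,
// then a linear falloff that reaches 0 at 3x the tolerance.
function formScore(liveDeg: number, joint: CalibratedJoint): number {
  const deviation = Math.abs(liveDeg - joint.baselineDeg);
  if (deviation <= joint.toleranceDeg) return 100;
  const excess = deviation - joint.toleranceDeg;
  const falloff = 2 * joint.toleranceDeg;
  return Math.max(0, 100 * (1 - excess / falloff));
}
```

Because `baselineDeg` comes from the user's own body, someone with naturally limited hip flexion is scored against what their joints actually do, not against a textbook range.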

Real-time orchestration. MediaPipe, Gemini calls, TTS, voice recognition, and React state all running simultaneously. Timing the coach so it doesn't talk over itself or flicker the UI required careful async queue management.

Making the AI not annoying. Early versions corrected every frame. We tuned to 8-second intervals, added voice-busy detection, and gave each persona distinct prompting so feedback feels human.
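The throttling described above can be sketched as a small gate (assumed names and structure; the real orchestration also coordinates Gemini calls and React state): corrections fire at most once per interval and are suppressed while TTS audio is still playing, so the coach never interrupts itself.

```typescript
class CoachThrottle {
  private lastSpokeAt = -Infinity;
  private speaking = false;

  constructor(private intervalMs = 8_000) {}

  // Wire these to the TTS player's start/end events.
  onTtsStart() { this.speaking = true; }
  onTtsEnd() { this.speaking = false; }

  // Called on every analysis tick; returns true only when a new
  // correction may be spoken right now.
  shouldSpeak(nowMs: number): boolean {
    if (this.speaking) return false;                     // voice-busy: drop it
    if (nowMs - this.lastSpokeAt < this.intervalMs) return false; // too soon
    this.lastSpokeAt = nowMs;
    return true;
  }
}
```

Dropping (rather than queueing) corrections that arrive while the coach is busy is the key design choice: stale form feedback is worse than silence.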

Accomplishments that we're proud of

  • Per-user calibration that scores against your body, not generic targets
  • Five coach personas that genuinely sound and behave differently
  • PT escalation flags — the AI knows when to tell you to see a real therapist
  • On-device pose detection — joint tracking runs entirely in the browser, and only the periodic coaching frames ever reach an API
  • Full end-to-end flow (intake → plan → calibration → coaching → report) in under 5 minutes

What we learned

The gap between "pose detection demo" and "useful coaching tool" is almost entirely about calibration and timing — not the AI models. Making the system feel responsive, personal, and trustworthy was more UX problem than engineering problem. Gemini's multimodal capabilities (joint data + screenshots simultaneously) unlock coaching quality impossible with pose data alone.

What's next for KinetiCare

  • Progress tracking — longitudinal ROM trends across sessions
  • Form-gated rep counting — reps only count when form is correct
  • PT portal — real therapists review flagged sessions and adjust protocols remotely
  • Mobile support — the stack is already browser-based, so this is mostly a responsive UI problem
  • Insurance-compatible reporting — align PT notes with CPT billing codes

Built With

  • nextjs