Inspiration

Every athlete faces rehab at some point. Whether it's a torn ligament, a sprain, or post-surgery recovery, the process is gruelling, lonely, and easy to get wrong. We've experienced this firsthand through kickboxing, BJJ, and track & field. The biggest problems? People either skip their exercises, do them with poor form (risking re-injury), or push through fatigue when they should stop. Physical therapists can't be there for every rep, and most rehab apps are just glorified checklists with no feedback.

We wanted to build something that actually watches you move, tells you when your form is slipping, and knows when you've had enough.

What it does

RehabFlow is a gamified rehab companion that:

  • Generates a personalised rehab plan based on your injury, mobility level, and pain threshold using an open-source LLM (Llama 3.3 70B via Groq), or parses a plan uploaded by your doctor
  • Tracks your exercise form in real time using MediaPipe BlazePose (33 body landmarks) and computes joint angles (knee flexion, hip flexion, trunk lean, knee valgus)
  • Scores every rep against ideal form ranges on a 0-100 scale
  • Detects fatigue-driven degradation by comparing your current form against a personal baseline established from your first 3 reps
  • Counts reps automatically using angle-threshold state detection
  • Gamifies recovery with XP, levels, and achievements to keep you consistent

All video processing happens on-device. No frames ever leave your machine.

How we built it

Computer Vision Pipeline

The core pipeline runs in real time on every webcam frame:

$$\text{Webcam} \xrightarrow{\text{MediaPipe}} \mathbf{L} \in \mathbb{R}^{33 \times 4} \xrightarrow{\text{angles}} \boldsymbol{\theta} \xrightarrow{\text{score}} s \in [0, 100] \xrightarrow{\text{fatigue}} d$$

where $\mathbf{L}$ is the landmark matrix (33 landmarks $\times$ 4 values: $x, y, z$ and visibility $v$), $\boldsymbol{\theta}$ is the vector of joint angles, $s$ is the form score, and $d$ is the fatigue drift.

Joint angles are computed using the three-point angle formula:

$$\theta = \arccos\left(\frac{\vec{BA} \cdot \vec{BC}}{|\vec{BA}|\ |\vec{BC}|}\right)$$

where $B$ is the joint vertex and $A$, $C$ are the adjacent landmarks. For example, knee flexion uses hip--knee--ankle.
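A minimal Python sketch of the three-point formula, assuming each landmark is an `(x, y, z)` tuple. The function name `joint_angle` and the sample coordinates are illustrative and not taken from the project code.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at vertex b, formed by the segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(ba, bc))
    norm = math.sqrt(sum(p * p for p in ba)) * math.sqrt(sum(q * q for q in bc))
    if norm == 0.0:
        raise ValueError("degenerate joint: two landmarks coincide")
    # Clamp so floating-point round-off cannot push acos outside its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Made-up coordinates for a fully extended leg: hip, knee, ankle on one line.
hip, knee, ankle = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)
print(joint_angle(hip, knee, ankle))  # 180.0
```

The clamp matters with noisy landmarks: a ratio of 1.0000000002 would otherwise raise a domain error inside `acos`.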

Form scoring is deterministic: each joint angle $\theta_i$ is compared against an ideal range $[\theta_i^{\min}, \theta_i^{\max}]$ defined per exercise phase. The per-joint score is:

$$s_i = \max\left(0,\ 100 - 2 \cdot \max\left(0,\ \theta_i^{\min} - \theta_i,\ \theta_i - \theta_i^{\max}\right)\right)$$

The overall rep score is $s = \frac{1}{n}\sum_{i=1}^{n} s_i$.
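The scoring rule can be sketched in a few lines of Python. The ideal ranges below are made-up placeholders rather than the project's tuned values, and the names `joint_score` and `rep_score` are illustrative.

```python
def joint_score(theta, theta_min, theta_max):
    """Lose 2 points per degree outside [theta_min, theta_max], floored at 0."""
    deviation = max(0.0, theta_min - theta, theta - theta_max)
    return max(0.0, 100.0 - 2.0 * deviation)

def rep_score(angles, ideal_ranges):
    """Mean of the per-joint scores over every joint that has an ideal range."""
    scores = [joint_score(angles[name], *ideal_ranges[name]) for name in ideal_ranges]
    return sum(scores) / len(scores)

# Hypothetical ranges (degrees) for one exercise phase.
ideal = {"knee_flexion": (80.0, 100.0), "trunk_lean": (0.0, 30.0)}
measured = {"knee_flexion": 110.0, "trunk_lean": 20.0}
print(rep_score(measured, ideal))  # 90.0
```

Here the knee is 10° past its range and scores 80, the trunk is in range and scores 100, so the rep averages 90. Under this rule any joint 50° or more out of range scores 0.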

Fatigue detection uses baseline drift. The first 3 reps establish a personal baseline $\bar{s}_{\text{base}}$. Each subsequent rep is compared:

$$d = \frac{\bar{s}_{\text{base}} - s_{\text{current}}}{\bar{s}_{\text{base}}}$$

A drift $d > 0.15$ triggers a moderate alert. A drift $d > 0.30$ recommends stopping to prevent injury.
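A short Python sketch of the drift check, using the two thresholds stated above. The level names (`"ok"`, `"moderate"`, `"stop"`) and function names are illustrative assumptions.

```python
def fatigue_drift(baseline_scores, current_score):
    """Relative drop of the current rep score below the baseline mean."""
    base = sum(baseline_scores) / len(baseline_scores)
    if base <= 0:
        raise ValueError("baseline score must be positive")
    return (base - current_score) / base

def fatigue_level(drift):
    """Map a drift value to an alert level using the 0.15 / 0.30 thresholds."""
    if drift > 0.30:
        return "stop"
    if drift > 0.15:
        return "moderate"
    return "ok"

baseline = [90.0, 92.0, 88.0]  # scores of the first 3 reps, mean 90
for score in (85.0, 72.0, 60.0):
    print(score, fatigue_level(fatigue_drift(baseline, score)))
```

With a baseline mean of 90, a rep scored 72 drifts by 20% (moderate alert) and a rep scored 60 drifts by about 33% (stop). A rep that beats the baseline gives a negative drift, which falls through to `"ok"`.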

Rep counting uses a simplified angle-threshold state machine. For a squat, we track knee flexion $\theta_k$:

$$\text{state} = \begin{cases} \texttt{down} & \text{if } \theta_k < 130^\circ \\ \texttt{up} & \text{if } \theta_k > 160^\circ \end{cases}$$

A completed rep is registered on the transition $\texttt{up} \to \texttt{down} \to \texttt{up}$.
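One way to sketch this state machine in Python. Two details are assumptions not stated above: the counter starts in the `up` state, and angles between 130° and 160° leave the state unchanged. The class name `RepCounter` is illustrative.

```python
class RepCounter:
    """Counts up -> down -> up cycles of a joint angle using two thresholds."""

    def __init__(self, down_below=130.0, up_above=160.0):
        self.down_below = down_below
        self.up_above = up_above
        self.state = "up"  # assumption: the user starts standing
        self.reps = 0

    def update(self, knee_angle):
        """Feed one frame's knee angle (degrees); returns the rep count so far."""
        if self.state == "up" and knee_angle < self.down_below:
            self.state = "down"
        elif self.state == "down" and knee_angle > self.up_above:
            self.state = "up"
            self.reps += 1  # the cycle completes on the way back up
        # Angles between the thresholds change nothing (hysteresis).
        return self.reps

counter = RepCounter()
for angle in [170, 150, 120, 100, 140, 165, 170, 125, 165]:
    counter.update(angle)
print(counter.reps)  # 2
```

Because each transition needs only a single frame past its threshold, a squat that spans just three frames (above 160°, below 130°, above 160°) is still counted.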

ML Models

We trained three Random Forest classifiers on the InfiniteRep synthetic dataset ($n = 4000$ frames, squat + push-up). Each frame yields 59 features: 51 raw landmark coordinates plus 8 engineered joint angles. We validated with GroupKFold ($k = 5$), grouping by video ID to prevent data leakage.

| Model | Task | F1 Score |
| --- | --- | --- |
| Exercise Classifier | squat vs push-up | $0.9995 \pm 0.001$ |
| Phase Detector | up vs down | $0.8962 \pm 0.009$ |
| Form Quality | good vs poor | $0.8643 \pm 0.034$ |
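To illustrate why grouping by video ID prevents leakage, here is a standard-library sketch of a grouped k-fold split. It is a simplified stand-in written for this explanation, not scikit-learn's `GroupKFold` itself, and the video IDs are made up.

```python
from collections import defaultdict

def group_k_fold(groups, k=5):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if len(by_group) < k:
        raise ValueError("need at least k distinct groups")
    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest frames.
    folds = [[] for _ in range(k)]
    for g in sorted(by_group, key=lambda name: -len(by_group[name])):
        min(folds, key=len).extend(by_group[g])
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        yield train, test

# Hypothetical data: 10 videos with 4 frames each.
video_ids = [f"video_{i // 4}" for i in range(40)]
for train_idx, test_idx in group_k_fold(video_ids, k=5):
    train_videos = {video_ids[i] for i in train_idx}
    test_videos = {video_ids[i] for i in test_idx}
    assert train_videos.isdisjoint(test_videos)  # no video on both sides
```

A plain frame-level split would put near-identical neighbouring frames from the same video into both train and test, which inflates the reported scores.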

Form quality labels were engineered via z-score deviation from per-exercise angle means. Frames in the top 30% of deviation, i.e. those with $|z_i|$ above the 70th percentile $z_{0.70}$, were labelled "poor":

$$z_i = \frac{\theta_i - \mu_{\text{exercise}}}{\sigma_{\text{exercise}}}, \quad \text{label} = \begin{cases} \texttt{poor} & \text{if } |z_i| > z_{0.70} \\ \texttt{good} & \text{otherwise} \end{cases}$$
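A simplified Python sketch of this labelling rule. It assumes a single angle per frame for one exercise and breaks ties arbitrarily; how the project combined several angles is not specified here. The function name and sample angles are illustrative.

```python
import statistics

def label_frames(angles, poor_percent=30):
    """Label the poor_percent% of frames with the largest |z| as 'poor'."""
    mu = statistics.fmean(angles)
    sigma = statistics.pstdev(angles)
    if sigma == 0:
        return ["good"] * len(angles)  # no spread, nothing deviates
    abs_z = [abs((a - mu) / sigma) for a in angles]
    n_poor = len(angles) * poor_percent // 100
    # Rank frames by deviation; the top n_poor sit above the 70th percentile.
    ranked = sorted(range(len(angles)), key=lambda i: abs_z[i], reverse=True)
    poor = set(ranked[:n_poor])
    return ["poor" if i in poor else "good" for i in range(len(angles))]

# Made-up knee angles for ten frames; the last three are clear outliers.
angles = [90, 91, 89, 90, 92, 88, 90, 120, 60, 105]
print(label_frames(angles))
```

Labels built this way are relative: exactly 30% of frames are marked poor regardless of how good the dataset is overall, which is worth keeping in mind when reading the Form Quality score.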

LLM Integration

Groq API running Llama 3.3 70B Versatile generates structured RehabPlan JSON from a UserProfile containing age, injury type, sport, mobility level, and pain threshold ($1$--$10$). A retry loop (3 attempts, exponential back-off) and pre-generated fallback plans ensure reliability.
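The retry-and-fallback pattern can be sketched as below. The `generate` callable stands in for the actual Groq request, and `FALLBACK_PLAN`, the 1-second base delay, and the broad `except` are illustrative assumptions rather than the project's settings.

```python
import time

# Hypothetical stand-in for a pre-generated plan.
FALLBACK_PLAN = {"source": "fallback", "exercises": ["bodyweight squat"]}

def generate_with_retry(generate, profile, fallback_plan,
                        attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call generate(profile) up to `attempts` times, doubling the wait
    after each failure, then return fallback_plan if every call fails."""
    for attempt in range(attempts):
        try:
            return generate(profile)
        except Exception:  # real code would catch narrower API/validation errors
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # waits of 1s, then 2s
    return fallback_plan

def always_fails(profile):
    raise RuntimeError("simulated API outage")

waits = []  # record the delays instead of actually sleeping
plan = generate_with_retry(always_fails, {"injury": "ACL"}, FALLBACK_PLAN,
                           sleep=waits.append)
print(plan is FALLBACK_PLAN, waits)  # True [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the back-off testable without real delays.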

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 14, React 18, TypeScript, Tailwind CSS |
| Backend | FastAPI, Uvicorn, Pydantic v2 |
| Vision | MediaPipe BlazePose, OpenCV |
| ML | scikit-learn (Random Forest), joblib |
| LLM | Groq API (Llama 3.3 70B Versatile) |

Challenges we faced

  • MediaPipe version incompatibility: mediapipe==0.10.32 drops mp.solutions.pose in favour of the new Tasks API, and requires NumPy 2.x. We pinned mediapipe==0.10.9, numpy==1.26.4, and matplotlib<3.9
  • React re-renders killing the camera: State polling every $1.5\text{s}$ caused the CameraFeed component to remount, destroying the MediaStream. We moved the stream to a global ref outside React's lifecycle so it survives re-renders
  • Rep counting at low FPS: The original phase-matching state machine required $\geq 3$ frames per phase, but at low frame rates an entire squat spanned only $\sim 3$ frames. We replaced it with the angle-threshold approach described above
  • Synthetic-to-real gap: Models trained on synthetic COCO keypoints needed threshold tuning to handle noisy real webcam input

What we learned

  • Synthetic training data can achieve $F_1 = 0.9995$ on exercise classification, but bridging to real webcam input requires careful threshold calibration
  • GroupKFold validation is essential when frames from the same video are highly correlated ($\rho > 0.95$)
  • Browser camera APIs are fragile in React. Moving state outside the component lifecycle was the key architectural insight
  • Fatigue detection via baseline drift is simple but effective. You don't need a complex temporal model, just measure $d = (\bar{s}_{\text{base}} - s_{\text{current}}) / \bar{s}_{\text{base}}$

What's next for RehabFlow

  • Expand to more exercises (bicep curls, shoulder press, lunges)
  • Train on real webcam data collected during sessions to close the synthetic-real gap
  • Add a neural network comparison alongside Random Forest
  • Integrate wearable data (heart rate, accelerometer) for multi-modal fatigue detection
  • Build social features for accountability and community support
