Working out at home can be difficult without an instructor. With so many people working from home due to COVID-19, one of the biggest challenges is figuring out how to stay in shape without leaving the house. Fitto makes working out at home more enjoyable and encouraging by comparing your body positioning with that of a workout instructor on YouTube, helping you stay in shape effectively!

What it does

Fitto is a web application that lets you join in on the virtual fitness hype by showing how closely your live body positioning matches that of trainers and instructors on YouTube. Simply paste and submit the URL of a YouTube workout video, then see how well you line up with the instructor second by second, with instant feedback on your accuracy and body adjustments.

How we built it

The frontend is built with React and the PoseNet model. The client connects to the server-side socket for real-time event goodness, sending the user's PoseNet points captured from the webcam and receiving instant feedback on the user's positioning relative to the instructor in the YouTube video.
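To make the exchange concrete, here is a minimal sketch of what one message in each direction could look like. It is written in Python only to stay consistent with the backend; the browser would do the equivalent in JavaScript. The field names (`videoTime`, `keypoints`, `accuracy`, `hints`) and both helper functions are hypothetical and are not taken from Fitto's code.

```python
import json


def build_pose_message(video_time, keypoints):
    """Serialize one webcam frame's keypoints (hypothetical schema).

    keypoints is a list of (part, x, y, score) tuples.
    """
    return json.dumps({
        "videoTime": round(video_time, 3),  # seconds into the video
        "keypoints": [
            {"part": part, "x": x, "y": y, "score": score}
            for part, x, y, score in keypoints
        ],
    })


def parse_feedback(raw):
    """Decode the server's reply (hypothetical schema) into (accuracy, hints)."""
    data = json.loads(raw)
    return data["accuracy"], data.get("hints", [])
```

Sending the video timestamp alongside the points is one way to let the server know which part of the preprocessed instructor data to compare against.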

The backend is built with Flask, PyTorch, and the PoseNet model. When the user submits a YouTube URL, the backend preprocesses the video using the PoseNet model. When the frontend sends the user's PoseNet points, the backend returns an accuracy score in real time, based on the preprocessed points. The app uses dynamic time warping for a more accurate comparison between the user's PoseNet points and the YouTube instructor's PoseNet points.
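As a rough illustration of the dynamic time warping step, the sketch below computes the textbook DTW cost between two sequences of poses. It assumes each pose is a list of (x, y) points in the same order for both sequences, and it uses mean Euclidean distance as the per-frame cost. The function names and the choice of distance are assumptions for illustration, not Fitto's actual implementation.

```python
import math


def pose_distance(a, b):
    """Mean Euclidean distance between matching keypoints of two poses."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def dtw_distance(user_seq, instructor_seq):
    """Dynamic time warping cost between two pose sequences.

    Lower is more similar. A frame in one sequence may be matched to
    several frames in the other, so a delayed user is not penalized.
    """
    n, m = len(user_seq), len(instructor_seq)
    inf = float("inf")
    # cost[i][j] = best cost of aligning the first i user frames
    # with the first j instructor frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = pose_distance(user_seq[i - 1], instructor_seq[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # user frame repeats an instructor frame
                cost[i][j - 1],      # instructor frame repeats a user frame
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]
```

For example, a user who holds the first pose one frame longer than the instructor and then follows exactly gets a cost of 0, whereas a frame-by-frame comparison would report a mismatch on every later frame.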

The app is hosted on Google Compute Engine and uses Google Cloud SQL as its database.

Challenges we ran into

  1. The biggest challenge was handling the delay between the user's reaction time and the video's timestamp. We solved this with dynamic time warping, which allows the app to measure similarity regardless of latency.

  2. Another challenge was a limitation of the PoseNet model, which does not account for certain parts of the body. For example, if the user's hand is not within the camera frame, some of the PoseNet points are omitted, which marginally decreases the accuracy of the model. When the user receives a low score, we alert them that they can boost accuracy by working out in a well-lit and spacious environment.
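One possible way to soften the second challenge is to score only the keypoints that were detected with reasonable confidence in both poses, and to report how many were used so the app can decide when to show the lighting and space hint. The sketch below assumes each keypoint is an (x, y, confidence) tuple with coordinates normalized to the range 0 to 1. The threshold value and the function name are assumptions, not part of Fitto.

```python
import math

MIN_CONFIDENCE = 0.5  # assumed threshold, not taken from the project


def masked_pose_score(user, instructor, min_conf=MIN_CONFIDENCE):
    """Compare two poses using only keypoints confident in both.

    Returns (score, used): score is in [0, 1] with 1 meaning identical,
    or None when no keypoint is usable; used is the number of keypoints
    that contributed to the score.
    """
    dists = [
        math.dist((ux, uy), (ix, iy))
        for (ux, uy, uc), (ix, iy, ic) in zip(user, instructor)
        if uc >= min_conf and ic >= min_conf
    ]
    if not dists:
        return None, 0
    mean = sum(dists) / len(dists)
    # sqrt(2) is the largest possible distance inside the unit square.
    score = max(0.0, 1.0 - mean / math.sqrt(2))
    return score, len(dists)
```

A low `used` count suggests that much of the body is out of frame or poorly lit, which is a more specific trigger for the alert than a low score alone.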

Accomplishments that we're proud of

  • Being able to connect the client-side socket to the server-side socket for real-time goodness
  • Working on a project with friends virtually for the first time

What we learned

  • How to use the PoseNet model
  • How to use React
  • How to use GCP for deploying an app
  • Yoga workouts are actually vigorous!

What's next for Fitto

  • Complete mobile-web support
  • Leaderboard competitions to see who's the most accurate for different videos
  • Incentives for higher accuracy points