My sister earned her yoga teacher's certificate during the pandemic, and she found it incredibly frustrating to learn from video, where it is harder for an instructor to correct her poses. So I thought it would be cool to build an application that compares a student's poses against an instructor's.
What it does
This is a web application with two simple steps: upload a video of an instructor, then upload a video of a student. Using 3D pose estimation and some angle-matching algorithms, it processes the videos and displays the overlaid student and instructor poses, along with the error between the two for each frame.
How I built it
The frontend is built in React and talks to a Flask API. I chose VideoPose3D as the pose estimation library because of its success in recovering 3D joint coordinates from video. Once the poses for every frame of the student and instructor videos are computed, I use some simple linear algebra on PyTorch tensors to compute the angles between all adjacent limbs for both the instructor and the student. I then compute the angular difference between the two to display the overall error to the user.
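To make the angle-matching step concrete, here is a minimal sketch of the idea, not the app's actual code. It uses only the Python standard library instead of PyTorch tensors so that it stays self-contained. The names `JOINT_TRIPLES`, `joint_angle`, `frame_angles`, and `frame_error` are hypothetical, the joint indices are placeholders rather than VideoPose3D's real skeleton layout, and averaging the absolute angle differences is just one assumed way to define the per-frame error.

```python
import math

# Hypothetical joint triples (a, b, c): limbs a-b and b-c meet at joint b.
# The real indices depend on the skeleton layout the pose model outputs.
JOINT_TRIPLES = [(0, 1, 2), (1, 2, 3)]


def joint_angle(a, b, c):
    """Return the angle in radians at joint b between limbs b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return 0.0  # degenerate limb: two joints coincide
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_theta)


def frame_angles(pose, triples=JOINT_TRIPLES):
    """Compute one angle per joint triple for a single frame's 3D pose."""
    return [joint_angle(pose[a], pose[b], pose[c]) for a, b, c in triples]


def frame_error(instructor_pose, student_pose, triples=JOINT_TRIPLES):
    """Mean absolute angular difference (radians) between two poses."""
    inst = frame_angles(instructor_pose, triples)
    stud = frame_angles(student_pose, triples)
    return sum(abs(i - s) for i, s in zip(inst, stud)) / len(triples)
```

Calling `frame_error` once per frame pair gives an error curve over the video. Comparing angles rather than raw joint positions has a convenient property: angles between limbs do not change when a body is translated, rotated, or uniformly scaled, so the student and instructor do not need to be the same size or stand in the same spot.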
Challenges I ran into
Building the app was challenging, but writing the tutorial was even harder. I wanted to be as clear as possible, and I didn't want giant blocks of code in my tutorial without properly explaining them. I underestimated the writing component and gave myself far less time to write than I should have.
Accomplishments that I'm proud of
I'm really proud that I finished everything I did in the time I had. A teammate I began the competition with realized they did not have as much time to commit as they had thought, which doubled my workload. It was tough, but incredibly rewarding!
What I learned
- How to be an effective writer by structuring all of the points I want the reader to learn throughout the tutorial
- I learned a lot about pose estimation, and 3D pose estimation in particular
- I got to develop my own algorithm for computing angular differences between instructor and student, which was a fun little review of linear algebra
What's next for Yoga Pose
- As of right now, this is not a real-time application, since running the pose estimation model client-side is impractical. Even if the model ran at 30 FPS on a server, network latency would add too much lag for real-time use, so a more lightweight model would likely be needed for a production application.
- My algorithm doesn't account for lag: the student never follows the instructor exactly, so it needs to allow the student a few seconds to get into each pose.
- Anomaly removal is also important. For example, if the instructor needs to brush hair out of their face or makes a mistake themselves, the algorithm shouldn't penalize the student.
- Segmenting the video by yoga sequences would make for a better UX.
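The lag issue in the list above could be approached in several ways. Here is a minimal sketch of one of them, which is an assumption and not something the app currently does: each student frame is compared against a window of recent instructor frames and the smallest error is kept. The names `angle_error` and `lag_tolerant_errors` and the `max_lag_frames` parameter are hypothetical, and each frame is assumed to be a list of joint angles in radians.

```python
def angle_error(angles_a, angles_b):
    """Mean absolute difference between two equal-length lists of angles."""
    return sum(abs(x - y) for x, y in zip(angles_a, angles_b)) / len(angles_a)


def lag_tolerant_errors(instructor_angles, student_angles, max_lag_frames):
    """For each student frame, return the lowest error against the instructor
    frames from up to max_lag_frames earlier through the same frame index."""
    if not instructor_angles:
        raise ValueError("instructor_angles must contain at least one frame")
    last = len(instructor_angles) - 1
    errors = []
    for t, student in enumerate(student_angles):
        end = min(t, last)  # never look past the end of the instructor video
        start = min(max(0, t - max_lag_frames), end)
        window = instructor_angles[start:end + 1]
        errors.append(min(angle_error(inst, student) for inst in window))
    return errors
```

With a video at 30 FPS, `max_lag_frames=90` would give the student roughly three seconds to settle into a pose. The trade-off is that a wide window can also forgive a student who is holding the previous pose for too long, so the window size would need tuning.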