SimpleYogi

Using yoga and AI to elevate human consciousness. Are you ready to be a poser?
An example of the pose Anjaneyasana visualized after applying pose estimation

Inspiration

Being buried under stress while trying to get through school, work, or any number of responsibilities can wear anyone down, including the three of us. Millennials are known as the "Burnout Generation," though the problem stretches far wider, with as many as "one million people per day miss[ing] work because of stress". With this issue in front of us, we wanted to tackle the crisis by letting classical approaches lead us to new enlightenment.

Yoga has been known for generations as a stress reliever that reduces anxiety and cortisol levels. It is also meant to raise the consciousness of the practitioner by giving them time to focus on themselves. By leveraging PoseNet's convolutional neural network and classical KNN machine learning approaches, we endeavour to make yoga more accessible. We do this by letting individuals correct their form in the comfort and privacy of their own home, without the financial barrier to entry (especially useful for students).

What it does

This alpha release of the product shows users the selected pose and corrects their form in real time with an on-screen arrow. It runs within the browser and needs no additional hardware. Just a laptop and you're all set!

The beta release aims to fit further into the user's lifestyle. For an on-the-go routine, being able to follow along with a video or set routine while the software automatically detects your poses and offers real-time correction could make SimpleYogi a simple start to any go-getter's day.

See for yourself!

How we built it

There are 4 main components to our system:

Data & Model - We use PoseNet as our pose estimation backend, calling its API to obtain keypoint outputs. Once we found an initial dataset, every image had to be groomed (e.g. to have the same orientation and user angle) and sub-classified as a partial image, a diagram, or a full image of a person.
Clustering - These keypoints are then fed into a scikit-learn NearestCentroid algorithm. The benefit of using a clustering algorithm is that, as we augment our data, it will be easier to map different clusters to a single class (e.g. different orientations of a pose), which is better for the long-term growth of the product. It uses a custom cost function based on Google MoveMirror's weighted cosine similarity function for grouping the clusters.
Correction - Once our model is trained and set, we save the weighted model. Whenever needed, that model can then parse a fresh set of keypoints and output the estimated class along with the top three points to be corrected, weighted by both the amount of error and the PoseNet confidence.
Frontend - We use JavaScript and Flask to run the entire system out of the browser for seamless, accessible dissemination. Currently it runs only on localhost for demo purposes.
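
The Clustering and Correction steps above can be sketched roughly as follows. This is a minimal, pure-Python illustration and not the project's actual code: it swaps scikit-learn's NearestCentroid for a hand-rolled nearest-centroid lookup, and it uses a confidence-weighted distance as a stand-in for the custom cost function. The function names, the normalization step, and the toy centroids are all assumptions made for this example.

```python
import math


def normalize(keypoints):
    """Shift keypoints to their bounding-box origin and scale to unit L2 norm."""
    min_x = min(x for x, _ in keypoints)
    min_y = min(y for _, y in keypoints)
    shifted = [(x - min_x, y - min_y) for x, y in keypoints]
    norm = math.sqrt(sum(x * x + y * y for x, y in shifted)) or 1.0
    return [(x / norm, y / norm) for x, y in shifted]


def weighted_distance(pose, confidences, centroid):
    """Confidence-weighted sum of absolute coordinate differences (assumed cost)."""
    total_conf = sum(confidences) or 1.0
    error = sum(
        c * (abs(px - cx) + abs(py - cy))
        for (px, py), (cx, cy), c in zip(pose, centroid, confidences)
    )
    return error / total_conf


def classify(pose, confidences, centroids):
    """Return the name of the centroid closest to the pose."""
    return min(
        centroids,
        key=lambda name: weighted_distance(pose, confidences, centroids[name]),
    )


def top_corrections(pose, confidences, centroid, names, k=3):
    """Rank keypoints by (confidence * error) and return the k worst offenders.

    Each result is (keypoint name, (dx, dy)), where (dx, dy) is the offset
    from the user's keypoint to the target keypoint, usable for an arrow.
    """
    scored = []
    for name, (px, py), (cx, cy), c in zip(names, pose, centroid, confidences):
        dx, dy = cx - px, cy - py
        scored.append((c * math.hypot(dx, dy), name, (dx, dy)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(name, delta) for _, name, delta in scored[:k]]


# Toy usage with four hypothetical keypoints and two hypothetical pose classes.
names = ["a", "b", "c", "d"]
centroids = {
    "pose_one": [(0.0, 0.0), (1.0, 1.0), (2.0, 3.0), (3.0, 2.0)],
    "pose_two": [(0.0, 5.0), (1.0, 5.0), (2.0, 5.0), (3.0, 5.0)],
}
user_pose = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
user_conf = [1.0, 1.0, 1.0, 0.1]

label = classify(user_pose, user_conf, centroids)
fixes = top_corrections(user_pose, user_conf, centroids[label], names, k=2)
```

In a real pipeline both the user's pose and the centroids would first pass through something like normalize so that distance from the camera does not dominate the score. Weighting by confidence means a badly misplaced but poorly detected keypoint (here "d") ranks below a confidently detected one with a smaller error.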

Challenges we ran into

Processing the PoseNet model - We are all fairly unfamiliar with JavaScript and, despite our best efforts, could not get PoseNet to run purely through Python. Learning to work with the JavaScript interface, particularly for processing real-time video, was a steep learning curve that we are excited to have tackled.
Time constraints / Pivoting in data and model processing - Before PoseNet we tried two different models from the OpenPose library, but both gave very poor results on our data. Each took some time to set up (there was no single standard workflow). In parallel we were looking for and grooming images to feed the system. We also attempted to run the system on a GPU (both on a laptop and on a Google Cloud instance), though this did not work out. With all of these setbacks, we were concerned about whether we would finish a demo at all.

Accomplishments that we're proud of

Being able to reliably communicate with TensorFlow in real time
The frontend development. None of us has much experience in that space, so developing a frontend able to support client-side video capture was very rewarding.
Getting a model trained to the accuracy that we did. After the first few failures with the OpenPose experimentation, finally getting PoseNet to work was a big breakthrough.
For Felix and Sanjeevani, this is their first proper machine learning project, and getting anything working was quite exciting!

What we learned

How to use Flask to run a backend through Python
How to stream and interact with video in JavaScript
How to interface with the PoseNet API in JavaScript
How to use scikit-learn's clustering algorithms
How much of machine learning comes down to data processing, both before and after the data goes through the pipeline
Big Takeaway: We all want to become less specialized and more familiar with web development, since all AI systems have a human interaction component. And with models like PoseNet running exclusively through JavaScript, a healthy understanding of these frameworks would rapidly speed up development next time.

What's next for SimpleYogi

Product plan in a Nutshell

Increasing the breadth of poses available (more classes, with greater accuracy)
Gathering and augmenting the data to improve the accuracy of automatic pose classification (e.g. rotating and flipping images)
Using automated real-time pose classification to generate improvement reports/plans for users
Partnerships with known experts in the field to better validate our legitimacy and accuracy
Progress Plan + Datafication - allowing users to track their progress in improving their form over time. This can also be used to implement reward systems
Rolling out more features for tracking specific yoga routines, daily yoga reminders, integration with other fitness apps (e.g. Fitbit for fitness tracking during a routine), etc.
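
As an illustration of the data augmentation item above, flipping and rotating can be applied directly to extracted keypoints rather than to the images. The sketch below is only one possible approach and assumes keypoints are stored as a dictionary keyed by PoseNet part names such as "leftWrist"; the function names and the sample values are hypothetical.

```python
import math


def flip_horizontal(keypoints, image_width):
    """Mirror keypoints left-to-right and swap the left/right part labels."""
    flipped = {}
    for name, (x, y) in keypoints.items():
        if name.startswith("left"):
            mirrored = "right" + name[len("left"):]
        elif name.startswith("right"):
            mirrored = "left" + name[len("right"):]
        else:
            mirrored = name  # e.g. "nose" has no left/right counterpart
        flipped[mirrored] = (image_width - x, y)
    return flipped


def rotate(keypoints, degrees, center):
    """Rotate keypoints about a center point using the standard rotation matrix.

    In image coordinates (y pointing down), positive angles appear clockwise.
    """
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    rotated = {}
    for name, (x, y) in keypoints.items():
        dx, dy = x - cx, y - cy
        rotated[name] = (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
    return rotated


# Hypothetical sample: three keypoints from a 100-pixel-wide image.
sample = {
    "nose": (40.0, 10.0),
    "leftWrist": (10.0, 50.0),
    "rightWrist": (90.0, 55.0),
}
mirrored_sample = flip_horizontal(sample, 100.0)
tilted_sample = rotate(sample, 10.0, center=(50.0, 50.0))
```

Swapping the left/right labels matters because a mirrored image of a left-arm-raised pose is, anatomically, a right-arm-raised pose; without the swap the augmented sample would contradict its own labels.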