Due to COVID-19, people across the world are unable to exercise with their friends and trainers, whether that means a fitness coach or a martial arts instructor. They are stuck watching and trying to follow videos inside the confines of their homes. That setup lacks any feedback mechanism, something that is absolutely critical to growth. To solve this, we use computer vision and 3D animation to drive an avatar that gives responsive feedback in a video call.

The idea originally came about when one of our teammates' fathers wanted to exercise with his personal trainer but could not due to COVID-19. We thought about replicating the trainer's movements in a realistic 3D model that anyone could easily follow along with.

However, after reflecting, we are really excited, as we noticed the potential domain for this application is endless: the same information supports an in-depth analysis of a person's movements. This software can be used for anything that involves movement: physical fitness, yoga, sports analysis, walking technique correction, dancing, etc.

What it does

Cutting to the chase, our application does five things:

  • Uses PoseNet, our custom computer vision algorithm, and the Babylon.js library to transfer a user's 2D pose onto a 3D avatar in real time.
  • Lets the 3D avatar be viewed from multiple points of view without affecting its pose.
  • Lets users follow a virtual Mentor to learn new 3D poses by taking online lessons. Our pose comparison system verifies that the user has indeed learned the pose.
  • Supports live online sessions with a real instructor.
  • Gives real-time feedback on poses automatically (coming soon).

How we built it

We have four components:

  1. A video chat system and website. We modified AppRTC and decked it out with Groove's brand, style, and components.

  2. We created an app that runs PoseNet in the browser using tensorflow.js. We did painstaking hyperparameter tuning to maximize accuracy and performance.

  3. We needed to map the 2D joints from PoseNet onto the joints of the 3D skeleton. This was mathematically challenging: because each joint has its own axes of rotation, we had to work with quaternions and angles in 3D. Since tensorflow.js and Babylon.js are both JavaScript libraries, transferring data between them was straightforward.

  4. We compare the 3D avatar poses to give real-time feedback. Our algorithm compares the joint angles of the user's avatar with the instructor's; if every difference falls within a tolerance (a "breathing area"), we conclude that the user has successfully learned the pose.
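The in-browser setup from step 2 can be sketched as follows. The hyperparameter values shown are illustrative assumptions for the sketch, not our final tuned values, and the `estimatePose` helper is hypothetical; it assumes the `@tensorflow-models/posenet` package is available in the page:

```javascript
// Illustrative PoseNet loader configuration. These hyperparameter values
// are assumptions for this sketch, not the tuned values we shipped.
const poseNetConfig = {
  architecture: 'MobileNetV1', // lighter and faster than ResNet50
  outputStride: 16,            // smaller stride: more accurate but slower
  inputResolution: { width: 640, height: 480 },
  multiplier: 0.75,            // depth multiplier for MobileNetV1
};

// In the browser, the model is loaded once and then queried per frame
// (requires the @tensorflow-models/posenet package; not executed here).
async function estimatePose(posenet, videoElement) {
  const net = await posenet.load(poseNetConfig);
  // Resolves to { score, keypoints: [{ part, position: { x, y }, score }, ...] }
  return net.estimateSinglePose(videoElement, { flipHorizontal: true });
}
```

Tuning `outputStride` and `multiplier` is where the accuracy/performance trade-off lives: lower stride and higher multiplier improve keypoint quality at the cost of frame rate.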
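The joint-angle comparison from step 4 can be sketched as a pure function. The tolerance constant here is an assumed placeholder for the "breathing area"; the real threshold would be tuned empirically:

```javascript
// Assumed per-joint tolerance in radians, standing in for the "breathing
// area" described above (roughly 11 degrees).
const TOLERANCE_RAD = 0.2;

// Both arguments map joint names to angles in radians, e.g.
// { leftElbow: 1.05, rightKnee: 2.3, ... }.
function poseMatches(userAngles, mentorAngles, tolerance = TOLERANCE_RAD) {
  return Object.keys(mentorAngles).every((joint) => {
    const diff = Math.abs(userAngles[joint] - mentorAngles[joint]);
    return diff <= tolerance;
  });
}
```

For example, `poseMatches({ leftElbow: 1.0 }, { leftElbow: 1.1 })` succeeds because the 0.1 rad difference is inside the tolerance, while a 0.5 rad difference would fail.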

Challenges we ran into

  • Transferring 2D poses correctly into angles in 3D was a major challenge. We had to look at the local axes for each joint in the 3D skeleton and map the angle accordingly.
  • Working with WebRTC.
  • Combining the two incompatible libraries we used was very difficult; one of our team members had to rewrite one of them to remove its React dependency. The build process needed to combine them was also quite tedious.
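The first challenge above, deriving 3D joint rotations from 2D keypoints, can be illustrated in miniature. This is a simplified sketch of the general idea (a 2D bone angle via `atan2`, then an axis-angle quaternion about the joint's local rotation axis), not our exact mapping:

```javascript
// Angle of the bone from a parent keypoint to a child keypoint, in image
// coordinates (PoseNet's y axis grows downward).
function boneAngle2D(parent, child) {
  return Math.atan2(child.y - parent.y, child.x - parent.x);
}

// Unit quaternion [x, y, z, w] for a rotation of `angle` radians about a
// unit-length `axis`, similar to what Babylon.js expects for bone rotations.
function quaternionFromAxisAngle(axis, angle) {
  const s = Math.sin(angle / 2);
  return [axis[0] * s, axis[1] * s, axis[2] * s, Math.cos(angle / 2)];
}
```

In the real skeleton, each joint has its own local axis, and the 2D angle has to be measured relative to the bone's rest pose before conversion, which is exactly where the per-joint axis bookkeeping became painful.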

Accomplishments that we're proud of

We are most proud of our breakthrough in this field: while past implementations were limited to a 2D interpretation, we built on existing technology to create real-time 3D animation! Secondly, almost none of us had experience with the Babylon.js library or PoseNet, so we had to learn both, along with WebRTC, which none of us had used before. Thirdly, the library we used for 3D pose recognition depended on React, which was incompatible with the Python HTML-templating library that our WebRTC calling stack relied on, so we had to completely rewrite the 3D pose recognition library to drop React. Being able to overcome these challenges and create something both useful and novel is something we are really proud of!

What we learned

Below, please find a brief description of what each team member learned from the experience.

Adam - Throughout the process, I learned a plethora of technical skills. Primarily, I learned how to use the Windows Subsystem for Linux, Parcel, Google App Engine, Twilio (for STUN servers), yarn, and pip. I was also reminded of the importance of a proper automated build process, and of terminal commands.

David - At this hackathon, I learned three invaluable things. First, I learned how to deploy an application to Google Cloud. It was really cool; I hope to use cloud computing at future hackathons! I also learned how to build and deploy a Go collider server, and I dabbled in WebRTC and lots of other trendy technologies. But most important of all, I developed my integration and leadership skills. The process of turning our idea into a finished project, distributing the workload, and integrating everything was very challenging but very rewarding, and I am happy that I persevered.

Siddhant - While I already knew about PoseNet for 2D movement recognition, in our project we extended that movement into 3D space. As part of the project, I learned how to transform angles from 2D into 3D. While a daunting task, I am happy that we were able to accomplish it in the end!

Keshav - I learned how to use the Babylon.js library to animate the 3D character and its skeleton. Additionally, I used CSS for the first time to help style the HTML pages.

Peter - I learned HTML and CSS and put them into practice. I had traditionally used inline CSS, but this time I used a separate stylesheet. Second, I learned how to incorporate the designs and logos I made into the workflow.

What's next for Groove

We want to improve in four areas:

  • Our feedback system can incorporate joint-level highlights to pinpoint exactly what the user is doing incorrectly while following a pose.
  • An AI voice assistant that gives instructions to the user automatically, so users do not have to rely on feedback from a human instructor.
  • A better interface layout where the focus is on the 3D avatars rather than the instructor's live video stream.
  • Optimization, so our platform requires less computational power while still giving the user an amazing experience.

Additional Notes

You can experience a Virtual Guru yoga tutoring session, which forms the basis for our system to give users automated feedback. This feature is still in development and is hosted separately for the time being, but we are working on integrating it into the main site.
