What it does

SpatialU was envisioned as a cross-platform tool for learning skills by literally walking in an expert's shoes. By combining volumetric video (courtesy of Soar) with a mixed reality approach to first-person learning, it lets a user watch a 3D model of an action being performed while also seeing themselves through the headset's passthrough camera. The tool currently supports only a choreographed dance, but any skill that involves physical motion could be incorporated.

Have a look at a full walkthrough of the prototype here: https://www.youtube.com/watch?v=gFKQh62WzIE


We started by trying to teach people skills with embodied spatial interaction. Currently people learn skills or dances through paper instructions, YouTube, or TikTok videos. All of these require translating from 2D to 3D: turning a flat image or video into movement in the 3D world the learner actually interacts in. This is inherently hard because, by definition, learners don't know what to do yet. What if we created an easier way to do this?

Our starting point was volumetric capture video that users could see in 3D in front of them, as well as overlaid on top of their own body. Seeing the motion both with depth in front of them and overlaid on themselves should show people clearly what to do. We considered four main categories of tasks that would suit this approach:

  1. Dance (YMCA, Burning Man Dance, Waltz)
  2. Exercise (Yoga poses, Squat, Boxing techniques)
  3. Sports (Baseball Swing, Golf - driving and putting, Tennis Serve)
  4. Handyman (Woodworking Joinery - dovetail, Change a bike tire)

We then tried to get motion capture videos and use MRTK on the HoloLens 2 as our starting point. This is where we began to run into issues.

How we built it

This experiment started with access to volumetric video through Soar's Kinect capture rig. Using Unity's Oculus VR Manager, we combine the user's view of their surroundings, visuals of the action to perform, and UI messaging to create a complete learning model.

Challenges we ran into

The MRTK would not work with our Unity version, 2020.3.32, which we later learned was not compatible. After some version testing, we pivoted away from the MRTK and used the Oculus Integration manager, which was very effective. However, this effectively closed the door on HoloLens 2 development. The Oculus Integration's passthrough also didn't seem to be as high fidelity as the MRTK's output, so that is an area we would reconsider in a future version with more time.

Motion capture was also hard to get done at first. The motion capture team at Soar had hardware calibration issues, so the videos, while in 3D, were pretty rough looking. They had an awesome API for optimizing the huge textures, but we were told it did not currently work in DirectX, which is needed for our Oculus Quest 2 integration, so we had to use large OBJ and PNG files. Ten seconds of animated video ended up being about 3GB, so to reduce the performance cost we lowered the texture quality with a batch image dimension reduction, which reduced the size by approximately ⅓. In an effort to capture the shoes as well, the floor was included in the model, but this further reduced the space available for model textures, so further work with Soar on model exporting would be helpful.
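The relationship between image dimensions and data size in that texture reduction can be sketched with a little arithmetic. This is only an illustrative sketch, not the script we ran: the function names and the 4096 x 4096 texture size are hypothetical, it reads "approximately ⅓" as removing one third of the data, and it assumes size scales with pixel count, which is only roughly true for compressed PNG files.

```python
import math


def linear_scale_for_reduction(fraction_removed: float) -> float:
    """Scale factor for width and height that removes the given fraction of pixels.

    Pixel count grows with the square of the linear scale, so keeping
    (1 - fraction_removed) of the pixels needs the square root of that value.
    """
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    return math.sqrt(1.0 - fraction_removed)


def scaled_dimensions(width: int, height: int, fraction_removed: float) -> tuple[int, int]:
    """New (width, height) after applying the scale factor, rounded to whole pixels."""
    scale = linear_scale_for_reduction(fraction_removed)
    return max(1, round(width * scale)), max(1, round(height * scale))


def estimated_size_gb(original_gb: float, fraction_removed: float) -> float:
    """Rough size estimate, assuming size is proportional to pixel count."""
    return original_gb * (1.0 - fraction_removed)


if __name__ == "__main__":
    # Hypothetical texture size; the real capture resolution is not stated here.
    print(scaled_dimensions(4096, 4096, 1 / 3))
    # About 3GB for ten seconds of capture, as noted above.
    print(round(estimated_size_gb(3.0, 1 / 3), 2))
```

The point of the sketch is that removing a third of the pixels only requires shrinking each side to about 82% of its original length, which is why a fairly mild dimension reduction can still make a noticeable difference in file size.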

Accomplishments that we're proud of

We managed to get Spatial Video working in VR! We took the super large videos and cut them into bite-sized steps, which are better for learning and help us stay within the tech limits. We created a great logo and a beautiful scene with a coherent look, designing posters with the dance poses and steps.

What we learned

We had the pleasure of working extensively with Soar Streaming to capture Athena Demos and her super cool flash-mob dance :) Learning the methods for video capture, the process for exporting the models, and the compression methods was certainly a journey, but with Max as a guide it was a fun one. We were also able to experiment with the Oculus Quest 2's passthrough functionality and learn about the pros and cons of the system. While it lets users feel much more in control in VR, the image quality was noticeably low, which breaks immersion somewhat.

What's next for Spatial U

We feel SpatialU has demonstrated the concept of immersive first-person learning. However, we have only scratched the surface of the applications and tools that could help students across the world, in remote areas and metropolitan cities alike, to connect and expand their skills in a more personal way than ever before. The current dancing application relies on human movement alone: it doesn't include any external objects like hand tools or paint brushes, and the hand and voice controls could be much more robust. We look forward to expanding the capabilities of this application as the capabilities of the system progress.
