Inspiration

The invention of the internet gave us a freely accessible catalog of human memory. Machine learning models built on stable diffusion can distill that collective memory into images. We wanted to bring this novel technology to virtual reality, the most immersive interface between the senses and the digital.

What it does

We feed a prompt (e.g., "a green prickly cactus") to a text-to-3D stable diffusion model, and the generated 3D model is automatically added to a Unity project that builds to a Quest 2 headset. We offer two modes: an art gallery that brings focus to the generated 3D model, and a planetary mode that lets the player walk on the surface of an enlarged version of the model.

How we built it

Using this implementation of text-to-3D stable diffusion, with Google Colab as the compute power, we save the generated mesh to Google Drive and import it into our Unity project through the file system at runtime. The Unity project has complete VR functionality, added with the XR Interaction Toolkit package. In the Unity environment, we place the generated 3D models into a virtual space to create the art gallery; for the separate planetary mode, we enlarge a generated model and add a basic physics system so the player can walk along its surface, as sketched below.
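As a rough illustration of the planetary setup (the PlanetSetup class, SetUpPlanet helper, and scale value are our own placeholder names, not the project's actual code), the mode boils down to enlarging an imported model and giving it colliders that the player can stand on:

```csharp
using UnityEngine;

// Hypothetical helper sketching the planetary mode described above:
// enlarge a generated model and add colliders so the XR rig's
// character controller can walk on its surface.
public class PlanetSetup : MonoBehaviour
{
    [SerializeField] private float planetScale = 50f; // illustrative scale factor

    // Call this with the root of an imported model to turn it into a "planet".
    public void SetUpPlanet(GameObject importedModel)
    {
        importedModel.transform.localScale = Vector3.one * planetScale;

        // Add a MeshCollider to every mesh so the player can stand on it.
        foreach (var filter in importedModel.GetComponentsInChildren<MeshFilter>())
        {
            var collider = filter.gameObject.AddComponent<MeshCollider>();
            collider.sharedMesh = filter.sharedMesh;
        }
    }
}
```

In this sketch the colliders do the heavy lifting; locomotion and gravity come from the XR rig we configured with the XR Interaction Toolkit.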

Challenges we ran into

Working with the Google Drive API proved very hard, and we had to resort to downloading some files manually. The stable diffusion model we use is also very particular: the prompt MUST describe an object, for example "a piano with legs". It cannot be an emotion such as "happiness" or a setting such as "forest". Additionally, we ran into several issues with our intended import flow into Unity, but we worked around them using AsImpL, an open-source async mesh loader for Unity (see the sketch below).
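For reference, a minimal sketch of the runtime import, assuming AsImpL's ObjectImporter component and its ImportModelAsync API as published at github.com/gpvigano/AsImpL (the file path and object name here are placeholders):

```csharp
using UnityEngine;
using AsImpL; // open-source async OBJ loader for Unity

// Minimal sketch of a runtime import flow; names and paths are illustrative.
public class RuntimeModelLoader : MonoBehaviour
{
    // Path where the generated .obj file has been synced from Google Drive.
    [SerializeField] private string modelPath = "generated/cactus.obj"; // placeholder

    private void Start()
    {
        var importer = gameObject.AddComponent<ObjectImporter>();
        // Import the mesh asynchronously and parent it under this GameObject.
        importer.ImportModelAsync("GeneratedModel", modelPath, transform, new ImportOptions());
    }
}
```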

Accomplishments that we're proud of

We're especially proud of creating an entire build flow that imports results from Google Colab into Unity at runtime, something Unity does not easily support by default. We're also proud of our dedication, working through Wi-Fi issues and through the night to create something we're happy with as our first hackathon project.

What we learned

This was our first foray into ML-based art and Google Colab. We learned advanced Git functionality to support proactive, collaborative teamwork on a rapidly iterating project. We also learned how to navigate the various GitHub repositories and APIs to choose what would best serve the purposes of our project.

What's next for Dreamfusion VR

We would like to fully automate the pipeline with Google Cloud APIs to allow seamless importing through the cloud. We would also like to explore procedural generation using AI-generated models, moving beyond viewing one model at a time to exploring an expansive, dream-like world. A small feature that got really close but didn't make the final cut was displaying the text prompt that produced each model in the art gallery; currently, we display the file name (a sketch of how this could work follows). Finally, we would like to find ways to increase compute power to achieve even more fully formed models.
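One plausible approach to the prompt placard, entirely hypothetical (the sidecar .txt convention and the GalleryPlacard class are our own illustration, not shipped code), would be to save the prompt next to the mesh and fall back to the file name:

```csharp
using System.IO;
using UnityEngine;

// Hypothetical placard: shows the prompt that generated a model, falling back
// to the file name when no prompt is available (our current behavior).
public class GalleryPlacard : MonoBehaviour
{
    [SerializeField] private TextMesh label; // simple 3D text next to the model

    public void SetLabel(string meshPath)
    {
        // Assumed convention: the prompt is saved alongside the mesh as a .txt file.
        string promptPath = Path.ChangeExtension(meshPath, ".txt");
        label.text = File.Exists(promptPath)
            ? File.ReadAllText(promptPath).Trim()
            : Path.GetFileNameWithoutExtension(meshPath);
    }
}
```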
