About the Project
Inspiration
I wanted to explore a more natural way to create 3D content — turning voice prompts into 3D models you can immediately explore in VR/AR. I imagined a system where you could speak your idea, see it in immersive space, and eventually 3D print it.
What I Learned
This project taught me a lot about integrating AI, cloud infrastructure, and immersive VR:
- How to deploy a GPU-powered Cloud Run service capable of generating 3D models in real time.
- How to use AI Studio to scaffold backend services, deployment tooling, authentication, and infrastructure setup.
- How to integrate Immersive Web SDK in a Meta Quest VR environment for live AR/VR visualization.
- Handling real-time 3D asset loading and converting between formats (GLB → STL) for potential 3D printing.
How I Built It
The project has two main components:
Backend (Cloud Run)
- A GPU-enabled container running a Shap-E model generates .glb 3D models from text prompts.
- Models are converted to STL with Python's trimesh library for potential 3D printing.
- Google Cloud Storage buckets host the generated models.
- A Cloud Run-based auth token API instance ensures secure access.
- The Google Speech-to-Text API handles voice prompt transcription.
- All infrastructure, deployment, and local dev environment were guided by AI Studio.
Frontend (VR/AR)
- Built with the Immersive Web SDK, deployed to Firebase.
- Users enter VR, speak a prompt, and the 3D model appears in their space.
- Models can be moved and rotated using hands or controllers for inspection.
The pipeline looks like this: Voice input → Cloud Run (Shap-E GLB generation) → STL conversion → VR/AR display
Challenges
- 3D model quality: Early models are low-resolution, favoring speed; balancing fidelity with a real-time UX is still a challenge.
- Time constraints: Integrating the printing pipeline wasn’t feasible before the deadline.
- VR debugging: Testing immersive interactions remotely is tricky and required careful local network setup to connect the Quest to the Cloud Run service.
Next Steps
- Improve 3D model fidelity and generation speed.
- Complete direct 3D printing integration, likely using another Cloud Run instance to host open-source slicing tools.
- Expand multi-object scenes and more natural spatial interactions in VR.
Even as a proof-of-concept, this project demonstrates a novel workflow from voice → 3D → immersive exploration, powered by AI and cloud infrastructure.
Built With
- docker
- fast-api
- firebase
- google-cloud
- google-cloud-run
- google-speech-to-text-api
- iwsdk
- python
- shap-e
- typescript