About the Project
Inspiration
I wanted to explore a more natural way to create 3D content — turning voice prompts into 3D models you can immediately explore in VR/AR. I imagined a system where you could speak your idea, see it in immersive space, and eventually 3D print it.
What I Learned
This project taught me a lot about integrating AI, cloud infrastructure, and immersive VR:
- How to deploy a GPU-powered Cloud Run service capable of generating 3D models in real time.
- How to use AI Studio to scaffold backend services, deployment tooling, authentication, and infrastructure setup.
- How to integrate Immersive Web SDK in a Meta Quest VR environment for live AR/VR visualization.
- Handling real-time 3D asset loading and converting between formats (GLB → STL) for potential 3D printing.
How I Built It
The project has two main components:
Backend (Cloud Run)
- A GPU-enabled container running a Shap-E model generates .glb 3D models from text prompts.
- Models are converted to STL with Python's trimesh library for potential 3D printing.
- Google Cloud Storage buckets host the generated models.
- A Cloud Run-based auth token API instance ensures secure access.
- The Google Speech-to-Text API handles voice prompt transcription.
- All infrastructure, deployment, and local dev environment were guided by AI Studio.
Frontend (VR/AR)
- Built with the Immersive Web SDK, deployed to Firebase.
- Users enter VR, speak a prompt, and the 3D model appears in their space.
- Models can be moved and rotated using hands or controllers for inspection.
The pipeline looks like this: Voice input → Cloud Run (Shap-E GLB generation) → STL conversion → VR/AR display
Challenges
- 3D model quality: Early models are low-resolution, favoring speed; balancing fidelity with a real-time UX is still a challenge.
- Time constraints: Integrating the printing pipeline wasn’t feasible before the deadline.
- VR debugging: Testing immersive interactions remotely is tricky and required careful local network setup to connect the Quest to the Cloud Run service.
Next Steps
- Improve 3D model fidelity and generation speed.
- Complete direct 3D printing integration, likely using another Cloud Run instance to host open-source slicing tools.
- Expand multi-object scenes and more natural spatial interactions in VR.
Even as a proof-of-concept, this project demonstrates a novel workflow from voice → 3D → immersive exploration, powered by AI and cloud infrastructure.
Built With
- docker
- fast-api
- firebase
- google-cloud
- google-cloud-run
- google-speech-to-text-api
- iwsdk
- python
- shap-e
- typescript