Worldsplat: Text to Stylized 3D Worlds in Real-Time
Inspiration
Imagine you're an art director. You have a vision, like a moody cyberpunk alley, rain-slicked streets, neon bleeding through fog. But to see if it actually works? You wait weeks. Commission concept art. Render test footage. Spend thousands of dollars on pre-viz. And if it's wrong? Start over.
Art direction in games is decided blind. I built Worldsplat as an autonomous agent that generates and stylizes 3D worlds from natural language, so you can see your vision instantly.
How We Built It
Worldsplat chains three AI services through a WebSocket orchestration layer:
- World Labs generates 3D Gaussian splats from text prompts
- Splat viewer renders the splat and captures frames at ~15 FPS
- Decart Mirage stylizes frames in real-time (~47ms per frame)
A Python backend (aiohttp + websockets) coordinates the pipeline, while Gemini Flash powers natural language input.
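To make the data flow concrete, here is a minimal sketch of the three stages wired together with asyncio. It is not the project's actual code: the function names are hypothetical, the World Labs and Decart calls are replaced by stubs, and an in-process queue stands in for the WebSocket and WebRTC transport. Only the ~15 FPS and ~47 ms figures come from the description above.

```python
import asyncio

CAPTURE_FPS = 15            # approximate capture rate quoted above
STYLIZE_LATENCY_S = 0.047   # approximate per-frame stylization time quoted above


def frame_budget_ms(fps: float) -> float:
    """Milliseconds available per frame at the given capture rate."""
    return 1000.0 / fps


async def generate_world(prompt: str) -> dict:
    """Hypothetical stand-in for the text-to-splat request to World Labs."""
    await asyncio.sleep(0)  # placeholder for the real network call
    return {"prompt": prompt, "splat": b""}


async def capture_frames(world: dict, queue: asyncio.Queue, n_frames: int) -> None:
    """Hypothetical stand-in for the splat viewer emitting frames."""
    for index in range(n_frames):
        await queue.put({"index": index, "pixels": b""})
        await asyncio.sleep(1 / CAPTURE_FPS)
    await queue.put(None)  # sentinel: no more frames


async def stylize_frames(queue: asyncio.Queue, style: str, out: list) -> None:
    """Hypothetical stand-in for the stylization stage."""
    while True:
        frame = await queue.get()
        if frame is None:
            break
        await asyncio.sleep(STYLIZE_LATENCY_S)  # simulate stylization time
        out.append({**frame, "style": style})


async def run_pipeline(prompt: str, style: str, n_frames: int = 5) -> list:
    """Generate a world once, then capture and stylize frames concurrently."""
    world = await generate_world(prompt)
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # small buffer keeps latency low
    styled: list = []
    await asyncio.gather(
        capture_frames(world, queue, n_frames),
        stylize_frames(queue, style, styled),
    )
    return styled


if __name__ == "__main__":
    frames = asyncio.run(run_pipeline("moody cyberpunk alley", "neon noir"))
    print(len(frames), "frames stylized")
```

At ~15 FPS each frame has a budget of roughly 66.7 ms, so a ~47 ms stylization step fits with about 20 ms to spare for capture and transport. The bounded queue is one possible design choice: if stylization falls behind, capture blocks instead of building up a backlog of stale frames.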
Challenges
WebRTC on a private network was our biggest hurdle. Decart requires WebRTC for real-time frame streaming, but the hackathon's private network made peer-to-peer connections a nightmare. Getting STUN/TURN servers configured and NAT traversal working ate up significant debugging time.
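For readers who have not set up STUN/TURN before, the sketch below builds a standard WebRTC `RTCConfiguration` dictionary in Python and serializes it to JSON for a browser client. It is an illustration only, not the configuration we used: the hostnames and credentials are placeholders, the `build_ice_config` helper is hypothetical, and whether a given client SDK accepts a custom ICE configuration is an assumption.

```python
import json


def build_ice_config(
    stun_url: str,
    turn_url: str,
    turn_username: str,
    turn_credential: str,
    relay_only: bool = False,
) -> dict:
    """Build a dict shaped like the WebRTC RTCConfiguration dictionary."""
    return {
        "iceServers": [
            # STUN lets a peer discover its public address behind NAT.
            {"urls": [stun_url]},
            # TURN relays media when no direct path between peers exists.
            {
                "urls": [turn_url],
                "username": turn_username,
                "credential": turn_credential,
            },
        ],
        # "relay" restricts ICE to TURN candidates; "all" also allows direct paths.
        "iceTransportPolicy": "relay" if relay_only else "all",
    }


# Placeholder values for illustration only.
config = build_ice_config(
    stun_url="stun:stun.example.org:3478",
    turn_url="turn:turn.example.org:3478?transport=tcp",
    turn_username="demo-user",
    turn_credential="demo-pass",
    relay_only=True,
)
print(json.dumps(config, indent=2))
```

Setting the policy to `relay` trades some latency for reliability on networks that block peer-to-peer traffic, since everything goes through the TURN server.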
What We Learned
Gaussian splats are a game-changer for real-time 3D: they deliver photorealistic quality with lightweight rendering. And never underestimate networking issues when building on private infrastructure.
Running Worldsplat requires the worldlabs_bridge branch of alakazam_server: https://github.com/alakazam-gg/alakazam-server/tree/worldabs_brige
Built with World Labs, Decart, and Gemini.
Built With
- decart
- python
- worldlabs