Inspiration

Education has always struggled to make history and science feel real. Reading in a textbook about life in an ancient Chinese village, or about how photosynthesis works, is passive: you're told what happened, not shown it. We were inspired by the idea that immersive 3D environments could transform learning from memorization into genuine exploration, and that an AI guide embedded inside those worlds could answer questions the way a knowledgeable human standing next to you would.

What it does

Atlas lets you type any historical event or scientific concept and instantly step inside a generated 3D world built around it. Once inside, an AI Guide speaks and responds to your questions in real time, grounding every answer in the specific scene you're exploring. You can click on individual objects to focus on them, ask follow-up questions, and for STEM scenes, interact with guided experiments like a photosynthesis cycle that responds to your actions.

How we built it

Atlas is built on a Next.js frontend with a Node/Express backend. On the client, we render worlds with Spark + Three.js for real-time navigation, interaction, and HUD overlays. We use World Labs’ Marble model to generate immersive 3D Gaussian Splat worlds from text prompts, Google Gemini to power the scene interpretation and conversational Historical Guide, and ElevenLabs to synthesize the guide’s spoken responses. The scene graph system structures each world into labeled, queryable elements that give the AI precise spatial and historical context for every answer.
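To make the scene-graph idea concrete, here is a rough TypeScript sketch of what a labeled, queryable element and its conversion into prompt context might look like. The `SceneElement` shape and `buildSceneContext` helper are illustrative stand-ins, not Atlas's actual types:

```typescript
// Hypothetical shape of one labeled, queryable scene element.
interface SceneElement {
  id: string;
  label: string;                      // e.g. "rice paddy", "chloroplast"
  position: [number, number, number]; // world-space coordinates
  facts: string[];                    // grounding text fed into the AI prompt
}

// Flatten the scene graph into prompt context so the guide can only
// reference objects that actually exist in the generated world.
function buildSceneContext(elements: SceneElement[]): string {
  return elements
    .map((e) => `- ${e.label} at (${e.position.join(", ")}): ${e.facts.join("; ")}`)
    .join("\n");
}

const village: SceneElement[] = [
  {
    id: "e1",
    label: "rice paddy",
    position: [4, 0, -2],
    facts: ["flooded field used to grow rice"],
  },
];

console.log(buildSceneContext(village));
```

Serializing the graph this way gives the language model a closed set of referents, which is one common pattern for keeping answers anchored to a specific scene.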

Challenges we ran into

Getting the AI guide to stay grounded in the scene, rather than hallucinating people or objects that weren't there, required careful prompt engineering around the scene graph. Streaming 3D Gaussian Splat assets reliably while keeping the chat experience responsive was technically tricky, as world generation can take several minutes. Generated worlds can also come out blurry even when the user supplies reference images, which limits how detailed world building can be. Finally, integrating ElevenLabs TTS into a single chat round-trip without blocking the text response required carefully separating the audio pipeline.
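The audio-separation idea can be sketched as a split pipeline: the text reply goes out immediately, and synthesis runs in parallel so audio never blocks the chat round-trip. This is a minimal illustration, not Atlas's actual implementation; `answerWithVoice` and `synthesizeSpeech` are hypothetical names standing in for the real handler and the ElevenLabs call:

```typescript
// Hypothetical split pipeline: send the text reply first, then deliver
// audio whenever synthesis finishes, so TTS latency never delays the chat.
async function answerWithVoice(
  replyText: string,
  synthesizeSpeech: (text: string) => Promise<ArrayBuffer>, // stand-in for the TTS call
  sendText: (text: string) => void,
  sendAudio: (audio: ArrayBuffer) => void,
): Promise<void> {
  sendText(replyText);                             // text reaches the user immediately
  const audio = await synthesizeSpeech(replyText); // audio follows when ready
  sendAudio(audio);
}
```

The key design choice is simply ordering: the text callback fires synchronously before the first `await`, so the chat stays responsive regardless of how long synthesis takes.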

Accomplishments that we're proud of

We're proud that the full loop works end-to-end: you type a prompt, a real 3D world generates around it, and a voice-enabled AI guide answers your questions from inside that world with historically accurate detail. The STEM experiment mode felt especially innovative, teaching science through interaction rather than instruction.

What we learned

Building Atlas, we dove deep into Gaussian Splatting, which enabled us to create photorealistic worlds. Integrating Spark, World Labs' viewer for Gaussian Splat scenes, taught us how to embed and control splat rendering inside a web environment, and how the underlying .spz asset format packages splat data for streaming delivery. On the 3D web side, working with Three.js sharpened our understanding of scene graphs, camera controls, and how to layer interactive UI elements on top of a live 3D canvas without fighting the renderer.
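One small piece of that UI layering is mapping a DOM click on the canvas into the normalized device coordinates (-1 to 1 on each axis) that Three.js's `Raycaster.setFromCamera` expects for object picking. A minimal sketch of that conversion (the `toNDC` helper is illustrative, not Atlas's actual code):

```typescript
// Convert a mouse click on the canvas into normalized device coordinates,
// the (-1..1, -1..1) space Three.js raycasting works in. The y-axis is
// flipped because screen coordinates grow downward while NDC grows upward.
function toNDC(
  clientX: number,
  clientY: number,
  rect: { left: number; top: number; width: number; height: number },
): { x: number; y: number } {
  return {
    x: ((clientX - rect.left) / rect.width) * 2 - 1,
    y: -((clientY - rect.top) / rect.height) * 2 + 1,
  };
}
```

In a real handler, the result would be fed to a `THREE.Raycaster` via `setFromCamera` and intersected against the scene to find the clicked object.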

What's next for Atlas

We want to improve the visual fidelity of the generated 3D worlds. Gaussian Splatting can already produce a realistic explorable scene from a single image, but generating larger, more coherent worlds would make exploration more meaningful. We also want to add VR compatibility so users can have a more immersive experience and use body motion to interact with the world naturally.

Built With

Next.js, Node/Express, Three.js, Spark, World Labs Marble, Google Gemini, ElevenLabs