🌌 Surroundly

Transform YouTube into an immersive VR experience


💡 Inspiration

Watching YouTube in VR today means sitting in a void—a black emptiness with a floating screen. You're isolated in nothingness, watching content that depicts rich, vibrant worlds. Travel vlogs from Bali, music videos in neon-lit cities, documentaries about the cosmos... all viewed from an empty abyss.

We thought: what if your environment could match what you're watching?

Instead of watching a video about a beach from a void, what if you could feel like you're at the beach? Surroundly was born from this vision—using AI to transform passive viewing into spatial presence, filling that void with worlds that respond to your content.


🎯 What It Does

Surroundly transforms YouTube into an immersive VR experience for Meta Quest. When you browse to a video:

  • 🔍 AI analyzes the content — Our backend processes the video to understand its theme, setting, and key objects
  • 🌅 A 360° skybox is generated — An AI-generated panoramic environment surrounds you, matching the video's mood and context
  • 🎸 3D objects spawn around you — Relevant interactive objects appear with satisfying animations and spatial audio
  • 🤲 You can grab and explore — Pick up and inspect the spawned objects while you watch

The experience starts with an animated procedural starfield, then smoothly transitions into your personalized environment as AI generation completes.

It's YouTube, but you're inside it.
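The flow above — starfield first, then a dissolve into the generated environment once the backend finishes — can be sketched as a small state machine. The names here are illustrative, not the app's actual API:

```kotlin
// Illustrative session states; a hypothetical sketch, not the real implementation.
sealed class EnvironmentState {
    object Starfield : EnvironmentState()                         // procedural fallback, shown immediately
    data class Generating(val jobId: String) : EnvironmentState() // backend job in flight
    data class Ready(val skyboxUrl: String) : EnvironmentState()  // dissolve starfield -> generated skybox
}

fun describe(state: EnvironmentState): String = when (state) {
    is EnvironmentState.Starfield -> "showing starfield"
    is EnvironmentState.Generating -> "generating (job ${state.jobId})"
    is EnvironmentState.Ready -> "dissolving into ${state.skyboxUrl}"
}
```

The key property is that the viewer is never in a void: the starfield renders from frame one, and the generated skybox replaces it only when ready.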


🛠️ How We Built It

Frontend (Meta Quest App)

  • Built natively with Meta Spatial SDK on Android/Kotlin
  • Custom GLSL shaders for the animated starfield and 360° panorama rendering with smooth dissolve transitions
  • WebView panel integration for YouTube browsing with URL change detection
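URL change detection boils down to pulling a video id out of each navigated URL (e.g. from a `WebViewClient.doUpdateVisitedHistory` callback) and firing only when the id changes. A minimal sketch of the extraction step — the regex and function name are assumptions, covering standard watch URLs:

```kotlin
// Hypothetical helper: extract the 11-character video id from a YouTube watch URL,
// so the app can detect when the WebView navigates to a new video.
fun extractVideoId(url: String): String? =
    Regex("""[?&]v=([A-Za-z0-9_-]{11})""").find(url)?.groupValues?.get(1)
```

Comparing the extracted id against the last-seen id avoids re-triggering generation on reloads or in-page navigation that stays on the same video.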

Backend (AI Pipeline)

  • RESTful API that accepts YouTube URLs and returns generated assets
  • Video analysis pipeline powered by Llama 4 Scout, which extracts visual themes, objects, and context with high-level semantic understanding
  • AI-powered 360° skybox generation
  • 3D model selection/generation based on video content
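From the client's perspective, the contract might look like the sketch below — field and status names are illustrative assumptions, not the actual backend schema:

```kotlin
// Hypothetical request/response shapes for the generation API.
data class GenerateRequest(val youtubeUrl: String)

data class GenerateResponse(
    val jobId: String,
    val status: String,          // e.g. "pending" | "processing" | "done"
    val skyboxUrl: String?,      // set once the 360 panorama is ready
    val modelUrls: List<String>, // 3D assets selected/generated for the video
)

// A job is usable once it reports done and the skybox asset exists.
fun isComplete(r: GenerateResponse): Boolean =
    r.status == "done" && r.skyboxUrl != null
```

The frontend polls for the job status and swaps in the skybox and models as they become available.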

✨ Key Technical Features

  • Procedural starfield shader with multi-layer parallax and twinkling
  • Directional dissolve transitions that open from the viewer's gaze direction
  • Dynamic model spawning with bounce animations and physics
  • Automatic recentering of UI and objects when user repositions
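The directional dissolve can be driven by the angle between each skybox direction and the viewer's gaze, offset by an animated progress value. Here is one way to express that per-fragment math (the real shader runs in GLSL; this Kotlin version is a sketch under the same idea, with illustrative names):

```kotlin
import kotlin.math.sqrt

// Sketch of a gaze-directed dissolve: fragments facing the viewer dissolve first,
// and the effect sweeps around to the back as progress goes from 0 to 1.
fun dissolveFactor(dir: FloatArray, gaze: FloatArray, progress: Float): Float {
    fun dot(a: FloatArray, b: FloatArray) = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    fun norm(v: FloatArray): FloatArray {
        val len = sqrt(dot(v, v))
        return floatArrayOf(v[0] / len, v[1] / len, v[2] / len)
    }
    // alignment in [0,1]: 1 directly along the gaze, 0 directly behind the viewer
    val alignment = (dot(norm(dir), norm(gaze)) + 1f) / 2f
    // progress sweeps 0..1; gaze-aligned fragments reach full dissolve earliest
    return (progress * 2f - (1f - alignment)).coerceIn(0f, 1f)
}
```

At progress 0 everything shows the old skybox; by progress 0.5 the region directly in the gaze direction has fully dissolved, while the area behind the viewer is just starting.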

🚧 Challenges We Ran Into

  • Shader transitions — Getting smooth, directional dissolves between skyboxes required careful math to blend based on world-space normals and viewer direction
  • Model validation — AI-generated 3D models sometimes had invalid bounds (infinity values) or zero-size dimensions; we built robust validation to gracefully skip broken models
  • Async coordination — Balancing polling frequency for job status while avoiding API spam, handling user navigation mid-generation, and cancelling stale jobs
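The model-validation check described above reduces to rejecting bounding boxes with non-finite values or zero extent on any axis. A sketch of that guard — names are illustrative; the real check runs against the SDK's mesh bounds:

```kotlin
// Hypothetical bounds type and validator for AI-generated 3D models.
data class Bounds(val min: FloatArray, val max: FloatArray)

fun isValidModel(b: Bounds): Boolean {
    val values = b.min + b.max
    if (values.any { !it.isFinite() }) return false           // NaN / Infinity bounds
    return (0..2).all { b.max[it] - b.min[it] > 1e-6f }       // zero-size on some axis
}
```

Broken models are skipped rather than spawned, so one bad asset never takes down the scene.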

🏆 Accomplishments We're Proud Of

  • The "wow" moment: Seeing a starfield dissolve into an AI-generated beach while watching a surf video is genuinely magical
  • 🔄 It actually works end-to-end: From YouTube URL detection to AI generation to immersive rendering, the full pipeline delivers
  • 🌍 It generalizes: Surroundly works with any YouTube video. No matter what you pick, the system analyzes it and generates a matching environment — no curated list required.

📚 What We Learned

  • Meta Spatial SDK is powerful but requires thinking spatially—positioning UI relative to the viewer's pose and managing scene-object lifecycles
  • Custom shaders unlock unique visual experiences but require careful optimization for mobile VR
  • AI-generated content needs validation layers; you can't trust every output
  • Small details matter in VR — the spawn sound, the bounce animation, the dissolve direction all create presence

🚀 What's Next for Surroundly

  • 📺 Expanded content sources — Support for Facebook Video, Twitch, Vimeo, and local video files
  • 💾 Persistent environments — Save favorite generated environments to revisit later
  • 👥 Social viewing — Watch with friends in shared AI-generated spaces
  • 🧠 Deeper AI integration — Real-time environment updates as video content changes (scene detection)
  • 🎨 Creator tools — Let YouTubers define custom environments for their content
  • Skyboxes for shorter videos — For the hackathon, skyboxes are generated only for videos longer than 5 minutes; shorter videos currently get 3D models but no skybox
