🌌 Surroundly

Transform YouTube into an immersive VR experience


💡 Inspiration

Watching YouTube in VR today means sitting in a void—a black emptiness with a floating screen. You're isolated in nothingness, watching content that depicts rich, vibrant worlds. Travel vlogs from Bali, music videos in neon-lit cities, documentaries about the cosmos... all viewed from an empty abyss.

We thought: what if your environment could match what you're watching?

Instead of watching a video about a beach from a void, what if you could feel like you're at the beach? Surroundly was born from this vision—using AI to transform passive viewing into spatial presence, filling that void with worlds that respond to your content.


🎯 What It Does

Surroundly transforms YouTube into an immersive VR experience for Meta Quest. When you browse to a video:

  • 🔍 AI analyzes the content — Our backend processes the video to understand its theme, setting, and key objects
  • 🌅 A 360° skybox is generated — An AI-generated panoramic environment surrounds you, matching the video's mood and context
  • 🎸 3D objects spawn around you — Relevant interactive objects appear with satisfying animations and spatial audio
  • 🤲 You can grab and explore — Pick up and inspect the spawned objects while you watch

The experience starts with an animated procedural starfield, then smoothly transitions into your personalized environment as AI generation completes.

It's YouTube, but you're inside it.
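The flow above — starfield first, then a dissolve into the generated environment once the backend finishes — can be sketched as a small state machine. The names here are illustrative, not the app's actual API:

```kotlin
// Illustrative session states; a hypothetical sketch, not the real implementation.
sealed class EnvironmentState {
    object Starfield : EnvironmentState()                         // procedural fallback, shown immediately
    data class Generating(val jobId: String) : EnvironmentState() // backend job in flight
    data class Ready(val skyboxUrl: String) : EnvironmentState()  // dissolve starfield -> generated skybox
}

fun describe(state: EnvironmentState): String = when (state) {
    is EnvironmentState.Starfield -> "showing starfield"
    is EnvironmentState.Generating -> "generating (job ${state.jobId})"
    is EnvironmentState.Ready -> "dissolving into ${state.skyboxUrl}"
}
```

The key property is that the viewer is never in a void: the starfield renders from frame one, and the generated skybox replaces it only when ready.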


🛠️ How We Built It

Frontend (Meta Quest App)

  • Built natively with Meta Spatial SDK on Android/Kotlin
  • Custom GLSL shaders for the animated starfield and 360° panorama rendering with smooth dissolve transitions
  • WebView panel integration for YouTube browsing with URL change detection
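URL change detection boils down to pulling a video id out of each navigated URL (e.g. from a `WebViewClient.doUpdateVisitedHistory` callback) and firing only when the id changes. A minimal sketch of the extraction step — the regex and function name are assumptions, covering standard watch URLs:

```kotlin
// Hypothetical helper: extract the 11-character video id from a YouTube watch URL,
// so the app can detect when the WebView navigates to a new video.
fun extractVideoId(url: String): String? =
    Regex("""[?&]v=([A-Za-z0-9_-]{11})""").find(url)?.groupValues?.get(1)
```

Comparing the extracted id against the last-seen id avoids re-triggering generation on reloads or in-page navigation that stays on the same video.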

Backend (AI Pipeline)

  • RESTful API that accepts YouTube URLs and returns generated assets
  • Video analysis pipeline powered by Llama 4 Scout, which extracts visual themes, objects, and context with high-level semantic understanding
  • AI-powered 360° skybox generation
  • 3D model selection/generation based on video content
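From the client's perspective, the contract might look like the sketch below — field and status names are illustrative assumptions, not the actual backend schema:

```kotlin
// Hypothetical request/response shapes for the generation API.
data class GenerateRequest(val youtubeUrl: String)

data class GenerateResponse(
    val jobId: String,
    val status: String,          // e.g. "pending" | "processing" | "done"
    val skyboxUrl: String?,      // set once the 360 panorama is ready
    val modelUrls: List<String>, // 3D assets selected/generated for the video
)

// A job is usable once it reports done and the skybox asset exists.
fun isComplete(r: GenerateResponse): Boolean =
    r.status == "done" && r.skyboxUrl != null
```

The frontend polls for the job status and swaps in the skybox and models as they become available.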

✨ Key Technical Features

  • Procedural starfield shader with multi-layer parallax and twinkling
  • Directional dissolve transitions that open from the viewer's gaze direction
  • Dynamic model spawning with bounce animations and physics
  • Automatic recentering of UI and objects when user repositions
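The directional dissolve can be driven by the angle between each skybox direction and the viewer's gaze, offset by an animated progress value. Here is one way to express that per-fragment math (the real shader runs in GLSL; this Kotlin version is a sketch under the same idea, with illustrative names):

```kotlin
import kotlin.math.sqrt

// Sketch of a gaze-directed dissolve: fragments facing the viewer dissolve first,
// and the effect sweeps around to the back as progress goes from 0 to 1.
fun dissolveFactor(dir: FloatArray, gaze: FloatArray, progress: Float): Float {
    fun dot(a: FloatArray, b: FloatArray) = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    fun norm(v: FloatArray): FloatArray {
        val len = sqrt(dot(v, v))
        return floatArrayOf(v[0] / len, v[1] / len, v[2] / len)
    }
    // alignment in [0,1]: 1 directly along the gaze, 0 directly behind the viewer
    val alignment = (dot(norm(dir), norm(gaze)) + 1f) / 2f
    // progress sweeps 0..1; gaze-aligned fragments reach full dissolve earliest
    return (progress * 2f - (1f - alignment)).coerceIn(0f, 1f)
}
```

At progress 0 everything shows the old skybox; by progress 0.5 the region directly in the gaze direction has fully dissolved, while the area behind the viewer is just starting.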

🚧 Challenges We Ran Into

  • Shader transitions — Getting smooth, directional dissolves between skyboxes required careful math to blend based on world-space normals and viewer direction
  • Model validation — AI-generated 3D models sometimes had invalid bounds (infinity values) or zero-size dimensions; we built robust validation to gracefully skip broken models
  • Async coordination — Balancing polling frequency for job status while avoiding API spam, handling user navigation mid-generation, and cancelling stale jobs
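The model-validation check described above reduces to rejecting bounding boxes with non-finite values or zero extent on any axis. A sketch of that guard — names are illustrative; the real check runs against the SDK's mesh bounds:

```kotlin
// Hypothetical bounds type and validator for AI-generated 3D models.
data class Bounds(val min: FloatArray, val max: FloatArray)

fun isValidModel(b: Bounds): Boolean {
    val values = b.min + b.max
    if (values.any { !it.isFinite() }) return false           // NaN / Infinity bounds
    return (0..2).all { b.max[it] - b.min[it] > 1e-6f }       // zero-size on some axis
}
```

Broken models are skipped rather than spawned, so one bad asset never takes down the scene.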

🏆 Accomplishments We're Proud Of

  • The "wow" moment: Seeing a starfield dissolve into an AI-generated beach while watching a surf video is genuinely magical
  • 🔄 It actually works end-to-end: From YouTube URL detection to AI generation to immersive rendering, the full pipeline delivers
  • 🌍 It generalizes: Surroundly works with any YouTube video. No matter what you pick, the system analyzes it and generates a matching environment — no curated list required.

📚 What We Learned

  • Meta Spatial SDK is powerful but requires thinking spatially—positioning UI relative to the viewer's pose and managing scene-object lifecycles
  • Custom shaders unlock unique visual experiences but require careful optimization for mobile VR
  • AI-generated content needs validation layers; you can't trust every output
  • Small details matter in VR — the spawn sound, the bounce animation, the dissolve direction all create presence

🚀 What's Next for Surroundly

  • 📺 Expanded content sources — Support for Facebook Video, Twitch, Vimeo, and local video files
  • 💾 Persistent environments — Save favorite generated environments to revisit later
  • 👥 Social viewing — Watch with friends in shared AI-generated spaces
  • 🧠 Deeper AI integration — Real-time environment updates as video content changes (scene detection)
  • 🎨 Creator tools — Let YouTubers define custom environments for their content
  • Skyboxes for shorter videos — For the hackathon, skyboxes are generated only for videos longer than 5 minutes; shorter videos currently get 3D models but no skybox
