Spatial Study

Studying at the Beach
Immersive Environment
3D Models Curated for video
Video Watching

Inspiration

Drawing inspiration from the informative videos of Khan Academy and Crash Course, we aspired to elevate the learning experience by making it immersive and interactive. With Next-Gen, students don't just watch dry educational content; they engage with it, manipulating virtual objects—like rubber balls in a physics lesson—to grasp complex concepts such as momentum in a tangible way.

What it does

Next-Gen bridges the gap between static video content and dynamic learning by transforming passive viewing into an active, immersive educational experience. The application syncs with educational YouTube videos, overlaying real-time 3D models and environments that correspond with the video's content. As students watch a video about, for example, the laws of motion, Next-Gen places them inside a customized virtual reality where they can interact with 3D objects like planets or machines, directly applying and testing theories as they learn them.

How we built it

We started by looking for an idea that targeted one of the hackathon's four verticals. Learning stood out immediately, since we were ourselves learning the new, innovative world of visionOS. After settling on the idea, we drew a diagram of the project's expected flow and delegated tasks to each member. Given the URL of a YouTube video, Tactiq, a speech-to-text tool, returns the video's transcript. We then pass the transcript to an LLM, ChatGPT 4, which returns two things:

A prompt for a Skybox describing an environment that fits the video
A set of prompts for 3D models at various points throughout the video

We then send each prompt to a generative AI service: Blockade Labs for the Skybox and LumaAI for the 3D models. Together they produce a Skybox that fits the setting of the YouTube video and 3D models that correspond to what the video is discussing at a given timestamp. A button toggles the immersive Skybox environment on or off.
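As a rough illustration of the hand-off between the LLM and the two generators, the sketch below parses an LLM reply into one Skybox prompt and a list of timestamped model prompts. The JSON layout, the field names (`skybox`, `models`, `start_seconds`, `prompt`), and the `ScenePlan` and `ModelCue` types are assumptions made for this example, not the format we actually used, and no real service is called.

```python
import json
from dataclasses import dataclass


@dataclass
class ModelCue:
    start_seconds: float  # video time at which the model should appear
    prompt: str           # text prompt to hand to the 3D model generator


@dataclass
class ScenePlan:
    skybox_prompt: str          # single prompt describing the environment
    model_cues: list[ModelCue]  # timestamped prompts, in chronological order


def parse_scene_plan(llm_reply: str) -> ScenePlan:
    """Parse the (assumed) JSON reply from the LLM into a ScenePlan."""
    data = json.loads(llm_reply)
    cues = [
        ModelCue(float(item["start_seconds"]), item["prompt"].strip())
        for item in data["models"]
    ]
    # Sort so that playback code can rely on chronological order.
    cues.sort(key=lambda cue: cue.start_seconds)
    return ScenePlan(data["skybox"].strip(), cues)


# Hypothetical reply, written by hand for this example.
example_reply = """
{
  "skybox": "a sunny physics classroom with large windows",
  "models": [
    {"start_seconds": 95, "prompt": "a pair of colliding billiard balls"},
    {"start_seconds": 30, "prompt": "a red rubber ball"}
  ]
}
"""

plan = parse_scene_plan(example_reply)
print(plan.skybox_prompt)
for cue in plan.model_cues:
    print(cue.start_seconds, cue.prompt)
```

Asking the LLM for a fixed, machine-readable layout like this keeps the two downstream requests independent: the Skybox prompt goes to one service and each model prompt to the other.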

Challenges we ran into

There were many highs and lows throughout the 24 hours, but one of the biggest challenges was that one of the APIs we relied on heavily for text-to-3D model generation went down for the entire night. This forced us to reconsider our approach and how to proceed. Another challenge was learning in a compressed time frame. For three of us, this was our first hackathon, and the curve from what we had learned in class to real-world implementation was incredibly steep. The hackathon was, for some of us, a first experience with APIs, Git, working with other coders, and more. However, learning was one of our main goals going in, and we all learned a lot about visionOS and how to turn code into working applications.

Accomplishments that we're proud of

Our proudest accomplishment in developing Next-Gen is our rigorous research and robust problem-solving capability. We tackled complex technical challenges, from intricate API integrations to ensuring seamless content synchronization. Our research-driven approach led us to innovative solutions, leveraging the collective knowledge of tech communities and the latest software development practices.

What we learned

Our venture into the Next-Gen project was marked by the strategic integration of the YouTube API v3, which empowered us to synchronize 3D models with video timestamps flawlessly, enhancing the interactive learning journey. The mastery of real-time data manipulation ensured a synchronized educational experience that was both engaging and precise.
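One simple way to decide which model to show at a given moment is a binary search over the sorted cue start times. The sketch below is a minimal illustration of that idea, not our actual implementation; `active_cue_index`, the cue times, and the model names are hypothetical, and the playback time is assumed to come from whatever the video player reports.

```python
from bisect import bisect_right


def active_cue_index(cue_starts, playback_seconds):
    """Return the index of the latest cue that has started, or None."""
    # bisect_right counts how many start times are <= playback_seconds.
    position = bisect_right(cue_starts, playback_seconds)
    return position - 1 if position > 0 else None


# Hypothetical cue times (seconds) and the model shown from each time on.
cue_starts = [30.0, 95.0, 140.0]
cue_models = ["rubber_ball", "billiard_balls", "pendulum"]

for t in (10.0, 30.0, 100.0, 500.0):
    index = active_cue_index(cue_starts, t)
    print(t, cue_models[index] if index is not None else "no model yet")
```

Because the lookup depends only on the current playback time, it still gives the right answer after the viewer scrubs backward or forward in the video.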

We also harnessed the Skybox Blockade API to instantaneously generate thematic environment images, skillfully intertwining AI's potential with our virtual reality landscape. This not only demonstrated our ability to handle complex API interactions but also showcased our adaptability in applying AI-generated content to enhance real-time user experiences.

Additionally, the inclusion of LumaAI elevated our project by transforming text prompts into vivid 3D models, enriching the visual learning component. We also succeeded in timing the AI-generated prompts, derived from the video captions, so that the resulting models appear in sync with the video content.

This fusion of APIs and AR technology not only sharpened our technical skills but also showcased the transformative potential of combining multiple cutting-edge tools to revolutionize the educational experience.

What's next for Next-Gen

Looking forward, we aim to expand our content library to cover a broader range of subjects and educational levels. We are working on incorporating adaptive learning algorithms to tailor the VR experience to individual learners' needs. Additionally, we plan to introduce multiplayer functionality, allowing for collaborative learning experiences in virtual environments.