Inspiration
We were inspired by the paradigm shift that spatial computing on the Apple Vision Pro represents. As AI and ML enthusiasts, we wanted to explore how the Vision Pro could work with more readily available consumer technologies, such as iPhones and 360-degree cameras. These devices already produce vast amounts of footage and media, yet their integration with the Vision Pro was lacking.
Through our own use of the Vision Pro, we found that there was no simple way to view 3D models, particularly models of everyday objects that could easily be captured with a phone. The 360-degree immersive experience also felt underwhelming: most existing applications relied on embedding YouTube or other conventional video platforms, which made custom workflows, such as running object detection and other ML models on 360-degree media, nearly impossible.
What it does
Vision360 lets users scan objects with their iPhones, process the scans with photogrammetry on a Mac, and then view and share the resulting models in an immersive 3D space on Apple Vision Pro. Vision360 also live-streams 360-degree video with real-time, AI-powered object detection, built on the first-ever custom protocol pipeline for delivering immersive 360-degree video to the Apple Vision Pro.
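For context, the Mac-side photogrammetry step can be driven by RealityKit's PhotogrammetrySession API (macOS 12+). Below is a minimal sketch of that step, not our exact implementation; the input and output paths are placeholders.

```swift
import Foundation
import RealityKit

// Placeholder paths: a folder of iPhone capture images in, a USDZ model out.
let inputFolder = URL(fileURLWithPath: "/path/to/captures", isDirectory: true)
let outputURL = URL(fileURLWithPath: "/path/to/model.usdz")

// Create a session over the image folder and request a medium-detail model.
let session = try PhotogrammetrySession(input: inputFolder)
try session.process(requests: [.modelFile(url: outputURL, detail: .medium)])

// Drain the session's async output stream until processing finishes.
// (Run as a command-line tool with top-level concurrency enabled.)
for try await output in session.outputs {
    switch output {
    case .requestProgress(_, let fraction):
        print("Progress: \(Int(fraction * 100))%")
    case .requestComplete(_, .modelFile(let url)):
        print("Model written to \(url.path)")
    case .requestError(_, let error):
        print("Request failed: \(error)")
    default:
        break
    }
}
```

A USDZ produced this way can be opened directly on visionOS, which is what lets the scan flow end-to-end from iPhone capture to Vision Pro viewing.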
How we built it
We developed native applications for iOS, macOS, and visionOS using Apple's first-party toolkits. The pipeline was designed for seamless hand-off between devices, with an emphasis on image quality and low-latency transmission. AI-powered object detection was implemented with YOLO models, and the computationally expensive inference was offloaded to HPC clusters and the cloud for real-time analysis.
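As a reference point, this is one way to run per-frame YOLO inference through Apple's Vision framework once the network has been converted to Core ML; the model filename here is a placeholder, and our heavier variants ran off-device rather than on the headset.

```swift
import CoreML
import CoreVideo
import Foundation
import ImageIO
import Vision

// Load a YOLO network compiled to Core ML (filename is a placeholder).
let coreMLModel = try MLModel(contentsOf: URL(fileURLWithPath: "YOLODetector.mlmodelc"))
let vnModel = try VNCoreMLModel(for: coreMLModel)

// Models exported with a non-maximum-suppression layer surface results
// as recognized-object observations with labels and bounding boxes.
let request = VNCoreMLRequest(model: vnModel) { request, _ in
    guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
    for object in results {
        let label = object.labels.first?.identifier ?? "unknown"
        print("\(label) @ \(object.boundingBox) confidence=\(object.confidence)")
    }
}
request.imageCropAndScaleOption = .scaleFill

// Run detection on one decoded video frame.
func detect(in frame: CVPixelBuffer) throws {
    let handler = VNImageRequestHandler(cvPixelBuffer: frame, orientation: .up)
    try handler.perform([request])
}
```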
Challenges we ran into
Like any cutting-edge project, we encountered several challenges along the way:
Limited Development Support: Because visionOS was such a new platform, documentation and community support were scarce, which made many technical decisions difficult.
Network Performance Uncertainty: Concerns over WiFi throughput and its impact on latency pushed us toward unconventional technologies for our streaming protocol (see the sketch after this list).
Apple's Privacy-Focused Development Approach: Apple's strict privacy policies required us to find creative workarounds for media and file importing.
Accessing Raw Camera Footage: Working with external cameras, such as Insta360, posed challenges in retrieving raw footage for processing and analysis.
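To make the protocol concern concrete, here is a minimal sketch of the sender side of a UDP frame-streaming scheme built on Apple's Network framework. The host, port, and chunk-header layout are illustrative assumptions for the sketch, not our actual wire format.

```swift
import Foundation
import Network

// Open a UDP connection to the headset (example LAN address and port).
let connection = NWConnection(
    host: NWEndpoint.Host("192.168.1.42"),
    port: NWEndpoint.Port(rawValue: 9000)!,
    using: .udp
)
connection.start(queue: .global())

/// Sends one encoded video frame, split into datagram-sized chunks so a
/// single lost packet costs only part of a frame instead of all of it.
func send(frame: Data, frameIndex: UInt32) {
    let chunkSize = 1200 // stay under a typical WiFi MTU
    let chunks = stride(from: 0, to: frame.count, by: chunkSize).map {
        frame.subdata(in: $0 ..< min($0 + chunkSize, frame.count))
    }
    for (i, chunk) in chunks.enumerated() {
        // 8-byte header: frame index, chunk index, total chunk count.
        var packet = Data()
        withUnsafeBytes(of: frameIndex.bigEndian) { packet.append(contentsOf: $0) }
        withUnsafeBytes(of: UInt16(i).bigEndian) { packet.append(contentsOf: $0) }
        withUnsafeBytes(of: UInt16(chunks.count).bigEndian) { packet.append(contentsOf: $0) }
        packet.append(chunk)
        connection.send(content: packet, completion: .contentProcessed { error in
            if let error { print("send failed: \(error)") }
        })
    }
}
```

A receiver for this scheme would reassemble chunks by frame index and drop incomplete frames rather than wait for retransmission, trading reliability for latency.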
Accomplishments that we're proud of
Successfully developing a cross-platform pipeline that enables seamless 3D scanning and streaming.
Implementing real-time AI object detection in immersive 360-degree video.
Overcoming visionOS development hurdles and optimizing performance for low-latency streaming.
Creating the first-ever custom protocol pipeline for live-streaming 360-degree immersive video from a custom source to the Apple Vision Pro.
What we learned
Development for Apple's Ecosystem: We learned how to develop native applications for various Apple OS platforms, including iOS, macOS, and visionOS.
Building for a 3D Paradigm: Developing applications in a 3D environment required us to rethink our approach to UI, UX, and data representation.
Accelerating Workflows with AI: We explored how AI tools and Apple's toolkits can optimize development and processing pipelines.
Protocol Development: Understanding how different technologies impact protocol development was crucial, especially since we prioritized quality and latency while also running a YOLO model for real-time object detection.
Offloading AI Processing: We learned how to offload computationally intensive AI processes to HPC clusters and the cloud, allowing for real-time AI enhancements without compromising performance.
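As an illustration of the offload pattern, this sketch posts a JPEG frame to a remote inference endpoint and decodes the detections it returns; the URL and response shape are hypothetical stand-ins, not our actual service contract.

```swift
import Foundation

// Hypothetical response shape returned by the remote detector.
struct Detection: Decodable {
    let label: String
    let confidence: Double
    let box: [Double] // [x, y, width, height], normalized
}

// Upload one encoded frame and await its detections.
func detectRemotely(jpegFrame: Data) async throws -> [Detection] {
    var request = URLRequest(url: URL(string: "https://inference.example.com/detect")!)
    request.httpMethod = "POST"
    request.setValue("image/jpeg", forHTTPHeaderField: "Content-Type")
    request.httpBody = jpegFrame

    let (data, _) = try await URLSession.shared.data(for: request)
    return try JSONDecoder().decode([Detection].self, from: data)
}
```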
What's next for Vision360
More AI-powered features like automated 3D model enhancement and advanced scene understanding.
Support for additional platforms to make our technology accessible to more devices.
Improved network optimizations to further reduce latency for real-time immersive experiences.
Integration with AR/VR applications beyond Vision Pro, allowing for broader adoption and new creative possibilities.