Inspiration

Vision Guide is inspired by the idea that interacting with physical devices should be as intuitive as interacting with software. In science fiction, tools like the Star Trek tricorder instantly understand context and guide the user without manuals or prior training. In reality, working with machines still relies heavily on instructions that exist outside the moment of action.

This project explores how mixed reality on Meta Quest can bridge that gap by bringing understanding, guidance, and validation directly into the user’s workspace.

What it does

Vision Guide is a mixed reality guidance experience built for Meta Quest that adapts instructions based on the real-world state of the device a user is working on.

Using passthrough, Vision Guide observes the physical object, determines where the user is within a workflow, and overlays only the instructions needed at that moment. As the user interacts with the device, Vision Guide visually confirms each completed action before advancing, ensuring instructions remain accurate, contextual, and error-resistant.
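The step-gated flow described above can be sketched as a small state machine: each step pairs an instruction with a verifier, and the workflow advances only when the observed device state confirms the action. This is an illustrative sketch, not our actual implementation; all names (`Step`, `GuidedWorkflow`, the state keys) are hypothetical.

```python
# Hypothetical sketch of a step-gated workflow: each step carries a
# verifier that must confirm the real-world action before the next
# instruction is shown. Names and structure are illustrative only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    instruction: str                  # text/overlay shown to the user
    verified: Callable[[dict], bool]  # checks the observed device state

class GuidedWorkflow:
    def __init__(self, steps):
        self.steps = steps
        self.index = 0

    @property
    def current_instruction(self):
        if self.index >= len(self.steps):
            return "All steps complete"
        return self.steps[self.index].instruction

    def observe(self, device_state: dict) -> None:
        """Advance only when the current step is visually confirmed."""
        if self.index < len(self.steps) and self.steps[self.index].verified(device_state):
            self.index += 1

# Example: a two-step workflow driven by mock perception output.
workflow = GuidedWorkflow([
    Step("Open the battery cover", lambda s: s.get("cover") == "open"),
    Step("Remove the battery",     lambda s: s.get("battery") == "removed"),
])
workflow.observe({"cover": "closed"})  # not confirmed; stays on step 1
workflow.observe({"cover": "open"})    # confirmed; advances to step 2
```

The key property is that instructions can never run ahead of reality: an unverified observation leaves the workflow exactly where it was.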

How we built it

Vision Guide is built in Unity using a cross-platform architecture that allows the core experience to run across Android, iOS, and Meta Quest, while enabling platform-specific capabilities where needed.
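One common way to structure this kind of split (sketched here in Python for brevity, though the project itself is Unity-based) is to keep the core logic behind an abstract capability interface that each platform implements. Every name below is hypothetical.

```python
# Illustrative sketch of a cross-platform split: a shared core talks to
# an abstract frame source, and each platform supplies its own
# implementation. Not the actual codebase; all names are hypothetical.
from abc import ABC, abstractmethod

class FrameSource(ABC):
    """Platform-specific way of obtaining frames of the physical scene."""
    @abstractmethod
    def next_frame(self) -> bytes: ...

class QuestPassthroughSource(FrameSource):
    def next_frame(self) -> bytes:
        # On Quest, this would come from the passthrough camera APIs.
        return b"passthrough-frame"

class MobileCameraSource(FrameSource):
    def next_frame(self) -> bytes:
        # On Android/iOS, this would come from the device camera.
        return b"camera-frame"

class GuidanceCore:
    """Shared logic that is identical on every platform."""
    def __init__(self, source: FrameSource):
        self.source = source

    def tick(self) -> str:
        frame = self.source.next_frame()
        return f"processed {len(frame)} bytes"

core = GuidanceCore(QuestPassthroughSource())
result = core.tick()  # same core code, regardless of platform
```

The core never knows which platform it is running on; swapping `QuestPassthroughSource` for `MobileCameraSource` changes the capability without touching the shared experience.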

On Meta Quest, the application leverages the Meta XR and OpenXR ecosystem to access passthrough camera data and anchor spatial guidance directly in the physical environment. Perception and action verification are designed to run entirely on-device, allowing the system to respond immediately to user actions without external dependencies.
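The math behind anchoring is worth making concrete: a point detected in camera space is mapped into world space by the headset's camera-to-world pose, so the overlay stays pinned to the object even as the user moves. The sketch below shows only that general transform; the pose values are illustrative, not from our application.

```python
# A minimal sketch of the math behind spatial anchoring: a point
# detected in camera space is mapped into world space with the
# headset's camera-to-world pose (a 4x4 matrix). Values are examples.

def transform_point(pose, point):
    """Apply a 4x4 row-major pose matrix to a 3D point."""
    x, y, z = point
    v = (x, y, z, 1.0)
    return tuple(sum(pose[r][c] * v[c] for c in range(4)) for r in range(3))

# Example headset pose: 1 m translation along +x, no rotation.
camera_to_world = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# A detection 0.5 m in front of the camera becomes a stable world anchor.
anchor = transform_point(camera_to_world, (0.0, 0.0, -0.5))
```

Once expressed in world coordinates, the anchor no longer depends on where the headset happens to be looking.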

Challenges we ran into

One of the early challenges was working around the lack of direct passthrough camera access before official APIs were available. To continue development and experimentation, we implemented temporary solutions using media projection–based approaches to access camera feed data. While functional, these approaches required careful handling and had clear limitations.
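The core limitation of that workaround is that a media-projection capture returns the whole rendered view, so the region showing the camera feed has to be cropped out before it can feed any vision processing. A toy sketch of that cropping step, with a nested-list "image" and made-up region coordinates:

```python
# The media-projection workaround captures the whole rendered view, so
# the camera-feed region must be cropped out before vision processing.
# Toy example; the region coordinates are illustrative.

def crop(frame, x, y, w, h):
    """Return the w*h sub-image at (x, y) of a row-major frame."""
    return [row[x:x + w] for row in frame[y:y + h]]

# 4x4 captured "screen"; the camera feed occupies the 2x2 top-left region.
screen = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
camera_region = crop(screen, 0, 0, 2, 2)
```

Because the crop depends on UI layout rather than a real camera intrinsic, any change to the rendered view could silently break the pipeline, which is one of the limitations that made official API support preferable.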

As passthrough APIs became available, transitioning to official support introduced its own challenges around stability, performance tuning, and spatial alignment. Iterating across evolving platform capabilities while maintaining a consistent experience required continuous testing and adaptation.

Accomplishments that we're proud of

We are particularly proud of running multiple computer vision pipelines, including a real-time detection system, entirely on-device with usable latency. Achieving reliable, real-time visual understanding on a standalone headset enabled Vision Guide to deliver contextual instructions and verify user actions without breaking workflow.
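A standard technique for keeping detection real-time on constrained hardware, and one way to read "usable latency" here, is to run the expensive detector only every Nth frame and a cheap tracker in between. This sketch illustrates that scheduling pattern in general; it is not Vision Guide's actual pipeline, and the detector/tracker bodies are stubs.

```python
# Sketch of detect-then-track scheduling: run the heavy detector every
# N frames and a light tracker otherwise. Stub functions stand in for
# real models; the pattern, not the bodies, is the point.
DETECT_EVERY = 5

def heavy_detector(frame):
    # Stub for an expensive full-frame detection model.
    return {"object": "device", "bbox": (10, 10, 50, 50)}

def light_tracker(frame, last):
    # Stub for a cheap tracker that nudges the previous box
    # instead of re-detecting from scratch.
    x, y, w, h = last["bbox"]
    return {**last, "bbox": (x + 1, y, w, h)}

def run(frames):
    state, log = None, []
    for i, frame in enumerate(frames):
        if state is None or i % DETECT_EVERY == 0:
            state = heavy_detector(frame)
            log.append("detect")
        else:
            state = light_tracker(frame, state)
            log.append("track")
    return state, log

state, log = run(range(7))
```

Amortizing the heavy model over several frames is what makes the per-frame budget fit on a standalone headset while the overlay still updates every frame.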

Integrating on-device perception with spatial 3D overlays allowed the experience to feel responsive and practical, demonstrating that mixed reality guidance can be viable beyond controlled demos.

What we learned

This project reinforced the value of building alongside the platform as it evolves. Adopting new SDKs and platform releases early made it possible to unlock capabilities that were previously inaccessible and significantly changed what the application could achieve.

We also learned that even small, simple pieces of guidance can be impactful. Clear instructions—no matter how minimal—can meaningfully help someone who is new to a task or unfamiliar with a device. In mixed reality, the right instruction at the right moment matters more than the amount of information presented.

What's next for Vision Guide

The next step for Vision Guide is to explore newer form factors such as Meta’s glasses and future wearable devices, where hands-free, context-aware guidance becomes even more relevant.

Our goal is to evolve Vision Guide into an everyday utility—an experience people can rely on regularly across different tasks, environments, and walks of life, wherever real-world guidance is needed.

Credentials for testing:
username: vgmeta@visionguide.io
password: vgteam2025
