Inspiration
As a lifelong fan of Iron Man, I've always been captivated by JARVIS—the quintessential sci-fi AI assistant that not only understands but also anticipates needs in real-time. Inspired to turn this cinematic vision into reality, I created GARVIS. Leveraging Microsoft Azure's OpenAI GPT-4 Turbo Vision model, I've overcome the traditional limitations related to rate limits and the accuracy issues of previous image models. GARVIS isn't just a step towards the future—it represents a giant leap towards making advanced mixed reality assistants accessible to everyone.
What it does
GARVIS (Generative Azure Responsive Visual Interface System) revolutionizes the concept of virtual assistants through an innovative integration of augmented reality (AR) and artificial intelligence (AI). Utilizing Azure AI's capabilities, GARVIS goes beyond basic voice commands and text responses—it visually and audibly interacts with its environment. Whether users need navigation help, instructional overlays, or complex computational assistance, GARVIS perceives, analyzes, and augments reality to provide unparalleled interactive experiences.
How we built it
The construction of GARVIS involved a combination of several advanced technologies and platforms:
- Unity Engine: The core framework for creating immersive AR experiences.
- C# Scripts: Developed the main logic and interaction mechanisms.
- Meta Mixed Reality SDK & Meta Voice SDK: Provided robust mixed reality and voice processing capabilities.
- Azure AI: Utilized Microsoft's Azure OpenAI GPT-4 Turbo Vision for real-time image and text analysis, enhancing the system's interactive and predictive capabilities.
- Python, Flask, and ngrok: Set up a reliable local server environment for handling API requests effectively.
- Meta Quest 3: The hardware foundation that supported the deployment of this sophisticated AR application.
- Microsoft's CoPilot: Extensively used to write code, reformat logic, write documentation, and debug the countless Unity bugs that occurred. This tool was invaluable in enhancing productivity and ensuring project completion.
- GPT-3.5 Turbo API: Leveraged for function calling to enable situational real-time rendering of 3D models and in-game interactable objects, providing a dynamic and responsive user experience.
Challenges we ran into
One of the main challenges was adhering to privacy constraints, especially Meta's restrictions on capturing direct passthrough images. We innovated a workaround by capturing images through Oculus casting to a PC, which allowed us to process visual information without compromising user privacy.
Accomplishments that we're proud of
GARVIS is a pioneer—it's the first application to integrate Azure AI's advanced vision and language processing capabilities into a mixed reality environment:
- Innovative Integration: Seamlessly blending AI with AR to deliver a holistic and interactive user experience.
- Visual Intelligence: Harnessing the cutting-edge capabilities of Azure AI to understand and interact with the user's environment dynamically.
What we learned
This project deepened our appreciation of the complexities of merging AI with AR/VR technologies. We explored innovative solutions to integrate real-time AI processing within privacy constraints and gained valuable insights into user privacy considerations.
What's next for GARVIS
- Enhancing Privacy Features: We are committed to continuously improving our data privacy protocols to ensure user trust and safety.
- Expanding AI Capabilities: Future updates will include more sophisticated AI features to enhance GARVIS's utility and responsiveness.
- Broadening Market Reach: We aim to make GARVIS accessible across various sectors such as education, healthcare, and enterprise, transforming how professionals interact with technology in their fields.
Forget about paying $800 plus $24/month for a Humane Pin. With GARVIS, you gain an all-encompassing AI assistant that doesn’t just tell you but shows you—bringing the power of AR visualizations and interactive assistance into your hands at a fraction of the cost.


Log in or sign up for Devpost to join the conversation.