Cook Master is a native Unity Mixed Reality application powered by Gemini 3’s multimodal intelligence. It recognizes real-world ingredients, generates personalized recipes for users to choose from, provides visualized step-by-step cooking guidance, and offers a multimodal assistant that supports users throughout the cooking process. The system is built around three core modules: Perception (Eyes), Intelligence (Brain), and Interaction (Hands).
Eyes: Gemini-Powered Passthrough Vision
Cook Master captures real-time video from the headset camera, enabling Gemini 3 to understand the physical environment and recognize ingredients directly on the countertop, even in messy, unstructured cooking scenarios. Based on this perception, Gemini dynamically matches available ingredients with suitable, personalized recipes.
Brain: Gemini Multimodal Reasoning
At the core of Cook Master, Gemini 3 serves as the reasoning and generation engine, performing:
- Structured recipe generation (name, introduction, steps)
- Multimodal content generation (textual instructions + visualized step images)
Gemini accepts both text and image inputs, so users can interact naturally at any moment and receive contextual, real-time guidance.
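To make the structured recipe output concrete, here is a minimal Python sketch of how a recipe with the three fields named above (name, introduction, steps) might be validated once it arrives as JSON. The JSON shape, the `Recipe` class, and the `parse_recipe` helper are assumptions for illustration only. They are not Cook Master's actual Unity code, and no Gemini API call is shown.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Recipe:
    """Structured recipe holding the three fields listed above."""
    name: str
    introduction: str
    steps: List[str]


def parse_recipe(raw_json: str) -> Recipe:
    """Validate a JSON string (hypothetical model output) and build a Recipe."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("recipe must be a JSON object")
    missing = [key for key in ("name", "introduction", "steps") if key not in data]
    if missing:
        raise ValueError(f"recipe is missing fields: {missing}")
    if not isinstance(data["steps"], list) or not data["steps"]:
        raise ValueError("steps must be a non-empty list")
    return Recipe(
        name=str(data["name"]),
        introduction=str(data["introduction"]),
        steps=[str(step) for step in data["steps"]],
    )


# Hypothetical sample payload, invented for this sketch.
sample = (
    '{"name": "Tomato Omelette", '
    '"introduction": "A quick breakfast.", '
    '"steps": ["Beat the eggs.", "Cook with diced tomato."]}'
)
recipe = parse_recipe(sample)
```

Validating the fields before display means a malformed response can be rejected or retried instead of reaching the step-by-step guidance view.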
Hands: Controller-Free Natural Interaction
We designed a fully controller-free interaction system, mapping core functions to intuitive semantic gestures:
- Open Hand: Start a voice conversation with Gemini
- Thumbs Up: Capture a photo and engage Gemini with multimodal (vision + voice) assistance
This enables touch-free operation, allowing users to continue cooking naturally without picking up a controller, touching a device, or interrupting their workflow.
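The gesture-to-function mapping described above can be sketched as a small dispatch table. This Python sketch is illustrative only: the `Gesture` enum, the `GestureDispatcher` class, and the placeholder handlers are hypothetical names, and the real application would receive gestures from the headset's hand-tracking system inside Unity.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Gesture(Enum):
    """The two semantic gestures named in the list above."""
    OPEN_HAND = "open_hand"
    THUMBS_UP = "thumbs_up"


class GestureDispatcher:
    """Maps each recognized gesture to one application action."""

    def __init__(self) -> None:
        self._handlers: Dict[Gesture, Callable[[], str]] = {}

    def register(self, gesture: Gesture, handler: Callable[[], str]) -> None:
        self._handlers[gesture] = handler

    def dispatch(self, gesture: Gesture) -> Optional[str]:
        """Run the handler for a gesture; return None if none is registered."""
        handler = self._handlers.get(gesture)
        return handler() if handler is not None else None


# Placeholder handlers; real ones would open the microphone or the camera.
def start_voice_conversation() -> str:
    return "voice_conversation_started"


def capture_photo_and_ask() -> str:
    return "photo_captured_for_multimodal_assist"


dispatcher = GestureDispatcher()
dispatcher.register(Gesture.OPEN_HAND, start_voice_conversation)
dispatcher.register(Gesture.THUMBS_UP, capture_photo_and_ask)
```

Keeping the mapping in one table makes it straightforward to add or rebind a gesture without touching the recognition code.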
Cook Master is more than just a recipe app; it represents a future where Gemini-powered systems understand the physical world, reason in context, and provide intelligent, real-time assistance, transforming everyday activities into seamless spatial experiences.