Inspiration
FixItXR was inspired by the idea that car repairs should feel intuitive rather than confusing. Instead of relying on paper manuals, tutorials, or guesswork, we wanted a mixed-reality assistant that overlays instructions directly onto the real world. Using the Meta Quest 3’s cameras and spatial tracking, we envisioned a system that identifies engine components in real time and guides users with interactive labels and step-by-step instructions.
What it does
FixItXR automatically scans the engine bay and labels each component with accurate, real-time annotations. It then shows step-by-step tutorials on how to perform common repairs or maintenance tasks. Labels are anchored to the physical parts of the car, and tutorials are delivered as text and audio and can be navigated with voice commands. The application can be used hands-free and works over Wi-Fi or a mobile hotspot.
How we built it
We built FixItXR using a combination of mixed reality tools, AI models, and spatial computing. Unity, together with the Meta Quest 3 passthrough and MR SDK, handled real-time spatial tracking, scene understanding, and anchoring virtual labels onto real engine components. We trained a Roboflow computer vision model to detect key parts inside the engine bay and connected it with Gemini Vision. OpenAI models powered the adaptive step-by-step guidance, reasoning, and TTS instructions that respond to the user’s actions. We added localisation, using the Gemini model for natural-language translation, and voice commands so the app can be used hands-free. By combining these systems, we created a seamless workflow that scans the environment, identifies components, and delivers clear, interactive repair tutorials.
We experimented with multiple deployment strategies:
- External Roboflow server: worked well but became too costly.
- Local offline inference on the headset (Sentis): functional but too slow, reducing FPS and causing lag.
- Gemini Vision with our trained model data: fast, cheap, and delivered high-quality detection through Wi-Fi, so we chose it for our current build.
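The trade-off above (on-device inference hurting FPS, network inference adding latency) is commonly handled by rate-limiting detection requests so the render loop never waits on them. The following is a minimal sketch of that idea, written in Python for brevity rather than the C# used in Unity. The class name, method names, and the 0.5 s interval are hypothetical and are not taken from the FixItXR code.

```python
import time
from typing import Callable, Optional


class DetectionThrottle:
    """Allow at most one detection request in flight, and at most one
    request per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self._clock = clock                      # injectable for testing
        self._last_sent: Optional[float] = None  # time of the last request
        self._in_flight = False                  # True while awaiting a reply

    def should_send(self) -> bool:
        """Return True if a new frame may be sent for detection now."""
        if self._in_flight:
            return False
        if self._last_sent is None:
            return True
        return self._clock() - self._last_sent >= self.min_interval

    def mark_sent(self) -> None:
        """Record that a request has just been dispatched."""
        self._last_sent = self._clock()
        self._in_flight = True

    def mark_done(self) -> None:
        """Record that the reply (or an error) has arrived."""
        self._in_flight = False
```

In a frame loop, the caller would check `should_send()` each frame, dispatch the camera image asynchronously when it returns True, and call `mark_done()` from the response callback, so rendering continues at full rate while a request is pending.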
We also added:
- localisation for English, French, Arabic, Polish and Spanish
- voice commands for all UI buttons in every supported language
- a grabbable menu that follows the user, can be moved, or locked in place
- a hands-free workflow
- scanning that provides correct labels on the engine bay
AI models deliver adaptive step-by-step summaries, instructions, voice command support, and TTS feedback based on user actions. All systems run smoothly together, creating a unified MR experience.
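One way to map spoken phrases in several languages onto the same UI actions is a normalised lookup table, sketched below in Python. This is an illustration under assumptions, not the app's actual implementation: the action names, the phrases, and the function names are placeholders, and only four of the five supported languages are shown.

```python
import unicodedata
from typing import Dict, Optional, Tuple

# Placeholder phrases per action and language code (illustrative only).
COMMANDS: Dict[str, Dict[str, str]] = {
    "scan": {"en": "scan", "fr": "scanner", "es": "escanear", "pl": "skanuj"},
    "next_step": {"en": "next", "fr": "suivant", "es": "siguiente", "pl": "dalej"},
    "previous_step": {"en": "back", "fr": "précédent", "es": "anterior", "pl": "wstecz"},
}


def normalise(text: str) -> str:
    """Case-fold, trim, and strip combining accents so that
    'Précédent ' and 'precedent' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.casefold().strip())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_lookup(commands: Dict[str, Dict[str, str]]) -> Dict[Tuple[str, str], str]:
    """Invert the table into (language, normalised phrase) -> action."""
    lookup: Dict[Tuple[str, str], str] = {}
    for action, phrases in commands.items():
        for lang, phrase in phrases.items():
            lookup[(lang, normalise(phrase))] = action
    return lookup


LOOKUP = build_lookup(COMMANDS)


def match_command(transcript: str, lang: str) -> Optional[str]:
    """Return the action for a recognised phrase, or None if unknown."""
    return LOOKUP.get((lang, normalise(transcript)))
```

Keying the lookup on the active language avoids cross-language collisions, so a Spanish phrase is ignored while the interface is set to English.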
Challenges we ran into
Our biggest challenge was getting computer vision to behave well in 3D. Pre-trained 2D models didn’t transfer to mixed reality, so we trained a custom model using many engine photos to improve accuracy. Coordinating multiple subsystems — detection, Unity MR interactions, AI tutorials, UI, and voice control — required precise timing to keep the experience responsive. Running models on-device also caused FPS drops, so we had to redesign our strategy and push inference through a fast network pipeline.
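A standard way to lift a 2D detection into 3D is to unproject the centre of its bounding box through a pinhole camera model, given a depth estimate at that pixel. The sketch below shows the arithmetic in Python. It is not the FixItXR implementation: the intrinsics values, the source of the depth value, and all names are assumptions, and the result is in camera space with the image convention (y pointing down), so placing a world-anchored label would additionally require the camera pose.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera parameters in pixels (hypothetical values in use)."""
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x
    cy: float  # principal point y


def bbox_centre(bbox: Tuple[float, float, float, float]) -> Tuple[float, float]:
    """Centre pixel of a box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = bbox
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def unproject(u: float, v: float, depth_m: float,
              k: Intrinsics) -> Tuple[float, float, float]:
    """Map pixel (u, v) at distance depth_m along the optical axis
    to a camera-space point (x, y, z) in metres."""
    x = (u - k.cx) / k.fx * depth_m
    y = (v - k.cy) / k.fy * depth_m
    return (x, y, depth_m)


# Example with made-up numbers: a 640x480 image and a part 1.5 m away.
k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
u, v = bbox_centre((300.0, 220.0, 340.0, 260.0))
point = unproject(u, v, 1.5, k)
```

A detection centred on the principal point lands on the optical axis, and the lateral offset grows linearly with both pixel offset and depth, which is why a poor depth estimate makes labels drift sideways as the user moves.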
Accomplishments that we're proud of
We achieved reliable component detection with accurate, stable labels inside a real mixed-reality environment. The menu system is intuitive, follows the user, can be repositioned or locked in place, and supports full voice control in multiple languages. We built structured step-by-step guidance, natural text-to-speech instructions, and hands-free interaction.
Our custom model and integrated MR workflow were a major milestone for our team. Winning 1st Place at the Barcelona XR Hackathon validated our idea and motivated us to continue development.
What we learned
We learnt how challenging and iterative computer vision training can be — data quality matters more than quantity. We also gained deep experience debugging MR interactions, synchronising tasks between different systems, and prioritising performance on a mobile headset. Most importantly, we learnt how to collaborate: balancing personal life, deadlines, and experimentation while building something ambitious in a short time.
What's next for FixItXR
FixItXR is still a new and evolving project. We started during the sensAI hackathon in Barcelona (16–18 November 2025) and have continued development whenever time allows. Next steps include:
- adding more repair tutorials
- supporting more car systems: brakes, suspension, electrics, interior
- expanding object recognition to cover all parts, engines, and vehicle types
- recognising boot space, tools, and consumables
- adding real-time error detection to prevent user mistakes
- adding segmentation models for precise part boundaries
- more AI overlays, user action monitoring, safety guidance, and improved interactivity
Our long-term vision is a comprehensive, user-friendly platform that helps anyone maintain and repair their vehicle confidently.
Team & Contributions
- Coding: Anna, Hemal
- Design: Anna
- Computer Vision: Chris
- Setup, testing & support: Jesus, Sarah
- Video creation: Sarah
- Language testing: Everyone



