FixItXR

Inspiration

FixItXR was inspired by the idea that car repairs should feel intuitive rather than confusing. Instead of relying on paper manuals, tutorials, or guesswork, we wanted a mixed-reality assistant that overlays instructions directly onto the real world. Using the Meta Quest 3’s cameras and spatial tracking, we envisioned a system that identifies engine components in real time and guides users with interactive labels and step-by-step instructions.

What it does

FixItXR automatically scans the engine bay and labels each component with real-time annotations. It then shows step-by-step tutorials on how to perform common repairs or maintenance tasks. Labels are anchored to the physical parts of the car, and tutorials are delivered through text and audio and can be controlled with voice commands. The application can be used hands-free and works over Wi-Fi or a mobile hotspot.

How we built it

We built FixItXR using a combination of mixed reality tools, AI models, and spatial computing. Unity, together with the Meta Quest 3 passthrough and MR SDK, handled real-time spatial tracking, scene understanding, and anchoring virtual labels onto real engine components. We trained a Roboflow computer vision model to detect key parts inside the engine bay and connected it with Gemini Vision. OpenAI powered the adaptive step-by-step guidance, reasoning, and TTS instructions that respond to the user’s actions. We added localisation with the Gemini model for natural language translations, plus voice commands so the app can be used hands-free. By combining these systems, we created a workflow that scans the environment, identifies components, and delivers clear, interactive repair tutorials.

We experimented with multiple deployment strategies:

External Roboflow server: worked well but became too costly.
Local offline inference on the headset (Sentis): functional but too slow, reducing FPS and causing lag.
Gemini Vision with our trained model data: fast, cheap, and delivered high-quality detection through Wi-Fi, so we chose it for our current build.
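
One way to keep a networked detector from hurting frame rate is to send a frame only when no request is pending and a minimum interval has passed. The sketch below illustrates that idea in Python for readability; it is not our Unity code, and the class name `DetectionThrottle`, the interval value, and the callback names are all hypothetical.

```python
import time
from typing import Callable, Optional


class DetectionThrottle:
    """Decide when a camera frame may be sent to a remote detector.

    Hypothetical helper: names and policy are illustrative assumptions.
    """

    def __init__(self, min_interval_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._last_sent: Optional[float] = None
        self._in_flight = False

    def should_send(self) -> bool:
        """Return True only if no request is pending and the interval has elapsed."""
        if self._in_flight:
            return False
        now = self._clock()
        if self._last_sent is not None and now - self._last_sent < self.min_interval_s:
            return False
        # Mark the request as started so later frames are skipped until it completes.
        self._last_sent = now
        self._in_flight = True
        return True

    def on_response(self) -> None:
        """Call when the remote detector replies or the request fails."""
        self._in_flight = False
```

In a render loop, `should_send()` would be checked once per frame, and skipped frames simply reuse the most recent labels.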

We also added:

localisation for English, French, Arabic, Polish and Spanish
voice commands for all UI buttons in every supported language
a grabbable menu that follows the user, can be moved, or locked in place
a hands-free workflow
scanning that provides correct labels on the engine bay
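
The follow-or-lock menu behaviour can be sketched as a "lazy follow": each frame the menu eases toward a point in front of the user's head unless it is locked. This is a minimal Python illustration under assumed values; `FollowMenu`, the 0.6 m distance, and the smoothing speed are hypothetical and not taken from our Unity project.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class FollowMenu:
    position: Vec3
    locked: bool = False
    distance: float = 0.6  # metres in front of the head (assumed value)
    speed: float = 4.0     # higher means the menu catches up faster (assumed value)

    def target(self, head_pos: Vec3, head_forward: Vec3) -> Vec3:
        """Point `distance` metres along the normalised head-forward direction."""
        length = math.sqrt(sum(c * c for c in head_forward))
        if length == 0.0:
            return self.position  # no valid direction: stay where we are
        return tuple(p + self.distance * f / length
                     for p, f in zip(head_pos, head_forward))

    def update(self, head_pos: Vec3, head_forward: Vec3, dt: float) -> Vec3:
        """Advance one frame; a locked menu does not move."""
        if self.locked:
            return self.position
        # Frame-rate independent exponential smoothing factor in [0, 1).
        t = 1.0 - math.exp(-self.speed * dt)
        goal = self.target(head_pos, head_forward)
        self.position = tuple(c + (g - c) * t
                              for c, g in zip(self.position, goal))
        return self.position
```

Grabbing and moving the menu would amount to setting `position` directly; setting `locked = True` then keeps it in place regardless of head movement.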

AI models deliver adaptive step-by-step summaries, instructions, voice command support, and TTS feedback based on user actions. All systems run together as a unified MR experience.

Challenges we ran into

Our biggest challenge was getting computer vision to behave well in 3D. Pre-trained 2D models didn’t transfer to mixed reality, so we trained a custom model using many engine photos to improve accuracy. Coordinating multiple subsystems — detection, Unity MR interactions, AI tutorials, UI, and voice control — required precise timing to keep the experience responsive. Running models on-device also caused FPS drops, so we had to redesign our strategy and push inference through a fast network pipeline.
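
To illustrate why 2D detections need extra work in mixed reality: a bounding box only gives pixel coordinates, so placing a label in space requires a depth estimate and the camera intrinsics. The sketch below shows the standard pinhole back-projection in Python. The function names, the intrinsics values, and the source of the depth (for example a raycast against the scene mesh) are assumptions for illustration, not a description of our exact pipeline.

```python
from typing import Tuple


def bbox_center(x_min: float, y_min: float,
                x_max: float, y_max: float) -> Tuple[float, float]:
    """Pixel coordinates of the centre of a 2D detection box."""
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def backproject(u: float, v: float, depth_m: float,
                fx: float, fy: float, cx: float, cy: float
                ) -> Tuple[float, float, float]:
    """Convert a pixel (u, v) at a known depth into a camera-space 3D point.

    Uses the pinhole model with the common computer-vision convention
    (x right, y down, z forward). An engine such as Unity uses y up, so
    the result would still need an axis flip and a camera-to-world transform.
    """
    x = (u - cx) / fx * depth_m
    y = (v - cy) / fy * depth_m
    return (x, y, depth_m)


# Hypothetical example: a detection centred 100 px right of the principal point.
u, v = bbox_center(700.0, 320.0, 780.0, 400.0)
point = backproject(u, v, depth_m=2.0, fx=500.0, fy=500.0, cx=640.0, cy=360.0)
```

The resulting point is where a spatial anchor for the label could be created once it is transformed into world space.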

Accomplishments that we're proud of

We achieved reliable component detection with accurate, stable labels inside a real mixed-reality environment. The menu system is intuitive, follows the user, can be repositioned or locked in place, and supports full voice control in multiple languages. We built structured step-by-step guidance, natural text-to-speech instructions, and hands-free interaction.

Our custom model and integrated MR workflow were a major milestone for our team. Winning 1st Place at the Barcelona XR Hackathon validated our idea and motivated us to continue development.

What we learned

We learnt how challenging and iterative computer vision training can be — data quality matters more than quantity. We also gained deep experience debugging MR interactions, synchronising tasks between different systems, and prioritising performance on a mobile headset. Most importantly, we learnt how to collaborate: balancing personal life, deadlines, and experimentation while building something ambitious in a short time.

What's next for FixItXR

FixItXR is still a new and evolving project. We started during the sensAI hackathon in Barcelona (16–18 November 2025) and have continued development whenever time allows. Next steps include:

adding more repair tutorials
supporting more car systems: brakes, suspension, electrics, interior
expanding object recognition to cover all parts, engines, and vehicle types
recognition of boot space, tools, and consumables
adding real-time error detection to prevent user mistakes
future segmentation models for precise part boundaries
more AI overlays, user action monitoring, safety guidance, and improved interactivity

Our long-term vision is a comprehensive, user-friendly platform that helps anyone maintain and repair their vehicle confidently.

Team & Contributions

Coding: Anna, Hemal
Design: Anna
Computer Vision: Chris
Setup, testing & support: Jesus, Sarah
Video creation: Sarah
Language testing: Everyone

Built With

gemini
handtracking
metasdk
openai
passthrough
pca
roboflow
unity

Submitted to

Meta Horizon Start Developer Competition

Created by

My work spanned the full project lifecycle: shaping the concept, designing the UI/UX, resolving technical issues, programming most new functionalities, implementing OpenAI guidance and Gemini localisation & Vision, and producing all tutorial materials and video content.

Anna Zielinska

I set up the initial project, developed core tutorial functionality, and supported the team across a range of XR development tasks, including implementing voice command features.

I also helped tackle integration challenges, and established a structured workflow by introducing a task-tracking system and defining key goals.

The team was fantastic to work with, and despite some demanding periods the experience was both enjoyable and enriching.

Hemal Bodasing
Chris Reda
Sarah Imdad
Jesus Hernandez