Inspiration
Introduction: Ever had a car issue that you had no idea how to fix? You take it into the shop, wait a few hours, and then get hit with an astronomical bill, often hundreds if not thousands of dollars, to fix superficial issues. The thing is, you're not alone. Thousands of people around the world face the same frustrations, not only struggling to identify what is wrong with their cars, but also being overcharged by auto repair shops for services they could easily handle on their own.
The Problem: A recent survey of 1,000 randomly selected U.S. adults reveals alarming insights about car maintenance habits: 46% of drivers admit they’ve paid for a repair that could have been avoided with better upkeep [1]. This is largely because people don’t know how to care for their cars (whether through regular oil changes, brake replacements, or other maintenance work), leading to excessive spending on repairs that, for many, may be financially out of reach.
Customer Insights: See Infographic Above
Our Solution
This report envisions a future where augmented reality (AR) and large language model (LLM) technologies become integral to our everyday lives (potentially within the next few years). As hardware continues to evolve, AR devices will become lighter and easier to use, spreading throughout homes, workplaces, and even car maintenance. Leading companies like Google and Meta are already exploring how AR and LLM technologies can work together. With this shift, learning and problem-solving will become more accessible, enabling anyone to engage with and master a wide range of skills. Our project aims to anticipate the possibilities of that world and use them to solve the problem at hand.
Introducing AutoLeARn, an interactive AR-based learning platform designed to guide and educate car owners through essential maintenance tasks, such as oil changes, brake replacements, and more. AutoLeARn empowers users to maintain their vehicles, prolong their vehicles' lifespan, and minimize costly repairs by providing clear, comprehensive guidance at every step. The device uses augmented reality to overlay visual instructions directly onto the car, ensuring users can follow along with ease (even while using their tools).
What it does (key features):
- Speech-to-Text Integration: Simply specify your car details and the service you want to perform using speech commands, complemented by intuitive hand gestures that let you quickly select and customize tasks.
- Interactive Step-by-Step Guidance: Take a screenshot using a hand gesture, and AutoLeARn returns a sequence of steps to perform the service. Each step is displayed in a panel, where users can swipe left and right to view different instructions. Every panel includes a step title and detailed description, ensuring the user follows along with ease.
- Integrated Video Assistance: If more guidance is needed, AutoLeARn provides an option to download and watch a tutorial video. The video is pulled based on the task at hand, and the video player is incorporated into a panel with rotation and translation manipulators for ease of use.
- Step Contextualization: For added convenience, AutoLeARn can point users to specific instructions based on the current step in the repair process. This feature guides users to the exact step they need (skipping over any parts that AutoLeARn detects the user has already completed).
With AutoLeARn, users get hands-on, personalized guidance as they learn to maintain their cars, reducing repair costs and increasing their confidence in car care.
How we built it
- Speech-to-Text Integration: We capture audio through Unity's audio system and pass it to Groq's speech-to-text endpoint (using Whisper) to transcribe the user's prompt, from which we extract the car make, model, year, and the issue with the car. We structure all of this information as JSON, send it back to Unity, and store it within the app. We used all three Groq modalities in this project, which was a new experience for us, especially considering this has never been done before in XR.
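The back-end half of this step could be sketched roughly as below, assuming Groq's official Python client. The extraction prompt, field names, and chat model choice are our own illustrative assumptions, not the project's actual code.

```python
import json

# Prompt for turning a free-form transcript into structured vehicle data.
# The field names here are an assumed convention, not AutoLeARn's real schema.
EXTRACTION_PROMPT = (
    "Extract the car make, model, year, and described issue from the "
    'transcript. Reply with JSON only: {"make": ..., "model": ..., '
    '"year": ..., "issue": ...}'
)

def parse_vehicle_json(raw: str) -> dict:
    """Defensively parse the model reply, which may wrap the JSON in prose."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model reply")
    data = json.loads(raw[start:end + 1])
    missing = {"make", "model", "year", "issue"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data

def extract_vehicle_info(client, audio_path: str) -> dict:
    """Transcribe the spoken prompt with Whisper on Groq, then extract fields."""
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-large-v3", file=f
        ).text
    reply = client.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[
            {"role": "system", "content": EXTRACTION_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return parse_vehicle_json(reply.choices[0].message.content)
```

The resulting dict is what would then be serialized and handed back to Unity for storage in the app.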
- Interactive Step-by-Step Guidance: We also use Groq's image-to-text understanding to extract context from what the user is seeing in real time. A thumbs-up hand pose, recognized through the Meta Interaction SDK, triggers a screenshot of the car the person is working on through their lens. We send that screenshot to a back-end API, where it is analyzed by a Vision Language Model (VLM). We then take the text description generated by the VLM and send it to Perplexity, which performs deep research to figure out the steps the user needs to follow in the repair process. Finally, we use Groq again to turn Perplexity's response into structured JSON, which is fed back into Unity for the canvas steps panel display.
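The VLM-to-Perplexity-to-Groq chain might look something like the sketch below, assuming OpenAI-compatible Python clients for both services. The model names, prompts, and step schema are our own assumptions for illustration.

```python
import json

def parse_steps(raw: str) -> list[dict]:
    """Validate the structured reply: a JSON array of {title, description}."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end == -1:
        raise ValueError("no JSON array in reply")
    steps = json.loads(raw[start:end + 1])
    for step in steps:
        if "title" not in step or "description" not in step:
            raise ValueError("each step needs a title and description")
    return steps

def build_repair_steps(groq, perplexity, image_description: str,
                       vehicle: dict) -> list[dict]:
    """Chain: Perplexity researches the repair, Groq structures it as JSON."""
    research = perplexity.chat.completions.create(
        model="sonar-pro",  # Perplexity's search-grounded model
        messages=[{
            "role": "user",
            "content": (
                f"A user is repairing a {vehicle['year']} {vehicle['make']} "
                f"{vehicle['model']} ({vehicle['issue']}). Their headset sees: "
                f"{image_description}. List the repair steps with citations."
            ),
        }],
    ).choices[0].message.content
    structured = groq.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{
            "role": "user",
            "content": ("Convert this into a JSON array of objects with "
                        '"title" and "description" fields:\n' + research),
        }],
    ).choices[0].message.content
    return parse_steps(structured)
```

Each dict in the returned list would map onto one swipeable panel in the Unity canvas display.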
- Integrated Video Assistance: We developed an API that takes a Perplexity video citation and downloads the video tutorial, which is then parsed into step-by-step segments for the maintenance/repair task. This required parsing the video's auto-generated subtitles and feeding them back into a reasoning model to split the video into informative steps based on the instructions.
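The subtitle-parsing portion of this step can be sketched as below, assuming the auto-generated captions arrive in WebVTT format (a common format for auto-captions). The timed cues produced here would then be passed to a reasoning model for segmentation into steps; this is our reconstruction, not the project's actual code.

```python
import re

# WebVTT cue timestamps: hours are optional, e.g. "00:04.000" or "00:00:04.000".
TIMESTAMP = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3}) --> (?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)

def _seconds(h, m, s, ms):
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

def parse_vtt(vtt_text: str) -> list[tuple[float, float, str]]:
    """Parse WebVTT captions into (start_sec, end_sec, text) cues."""
    cues = []
    lines = vtt_text.splitlines()
    i = 0
    while i < len(lines):
        match = TIMESTAMP.search(lines[i])
        if match:
            g = match.groups()
            start = _seconds(g[0], g[1], g[2], g[3])
            end = _seconds(g[4], g[5], g[6], g[7])
            i += 1
            text_lines = []
            # A cue's text runs until the next blank line.
            while i < len(lines) and lines[i].strip():
                text_lines.append(lines[i].strip())
                i += 1
            cues.append((start, end, " ".join(text_lines)))
        else:
            i += 1
    return cues
```

With the cues in hand, a reasoning model can be prompted with the full transcript (plus timestamps) and asked to group cues into labeled repair steps.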
- Step Contextualization: Built as an additional feature on top of the step-by-step guidance. It leverages the image description to identify which step the user is on and, from there, skips ahead to the current state of the repair process.
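One minimal way to implement this matching, again assuming an OpenAI-compatible client and an illustrative model name (not the project's real code), is to show the model the numbered step titles alongside the image description and ask for the current step number:

```python
import re

def parse_step_index(reply: str, n_steps: int) -> int:
    """Pull the first integer out of the model reply; clamp to a valid 0-based index."""
    match = re.search(r"\d+", reply)
    if not match:
        return 0  # fall back to the first step if the reply is unusable
    return min(max(int(match.group()) - 1, 0), n_steps - 1)

def locate_current_step(llm, image_description: str, steps: list[dict]) -> int:
    """Ask the model which step the captured view corresponds to."""
    numbered = "\n".join(f"{i + 1}. {s['title']}" for i, s in enumerate(steps))
    reply = llm.chat.completions.create(
        model="llama-3.3-70b-versatile",
        messages=[{
            "role": "user",
            "content": (
                "Given this view of the car: " + image_description +
                "\nWhich of these steps is the user currently on?\n" +
                numbered + "\nAnswer with just the number."
            ),
        }],
    ).choices[0].message.content
    return parse_step_index(reply, len(steps))
```

The returned index is what the Unity panel would jump to, skipping steps the user has already completed.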
Challenges we ran into
Due to interface issues between unpredictable LLM output and XR interaction, the feature of having a video ready for every step of the user experience had to be simplified so we could complete the project within the time frame.
Integrating XR with multiple LLM features and video tutorials was a challenge, due to the unpredictability of LLM responses and limited material availability. We also faced limitations of the Meta Quest 3, which does not include full support for features like display image capture, and had to develop workarounds with a special screen capture API.
Also, we attempted to implement an AI agent-based system but faced limitations in how we wanted the LLM to use the provided tools. Through many iterations, we realized that incorporating AI agents within our project’s scope and timeline would not be feasible. After reaching this conclusion, we simplified our AI system to a chain of LLMs and VLMs that work together to generate steps for the user to follow.
Accomplishments that we're proud of
We are proud of creating a project that fuses the worlds of AR and advanced AI into an impactful product. With a team of hackers of mixed experience levels, we worked well together and learned a lot collectively.
What we learned
With a variety of experience going in, we all learned different skills and tested different tools for the first time. Some of us learned about API interfacing and testing LLMs from Groq, Gemini, OpenAI, DeepSeek, and Perplexity. Others learned a lot about how development in XR works. Collectively, we learned about the limitations of the technology we worked with and about implementing innovative workarounds, like a special API for image capture on the Meta Quest 3 headset.
Looking Forward
Impact: AutoLeARn will dramatically simplify car maintenance for users, making it both easy and accessible for anyone to take charge of their vehicle’s care. With the power of AR, users no longer need to stop in the middle of a repair to search for instructions or watch complicated video tutorials. Instead, they can receive real-time, step-by-step guidance right in front of them, without having to put down their tools. The hands-free experience eliminates distractions and interruptions, making repairs quicker, more effective, and less stressful. By educating car users with this level of immediate, intuitive assistance, AutoLeARn not only reduces the need for costly professional help, but also gives users the skills and knowledge they need to maintain their cars independently.
Other Applications:
1. Helping Government-Run Centers Train New Car Mechanics: Another application of AutoLeARn is the key role it can play in government-run centers responsible for training new car mechanics. As noted by industry experts, a major challenge for agencies that train car mechanics is that they must constantly update their teaching materials to keep up with rapid advancements in technology (whether the rise of electric vehicles, hybrid cars, or increasingly complex trucks) [5]. AutoLeARn can help these government training programs evolve to cover a wider range of systems and tools, providing a better and more effective education.
2. Performing Safety and Functionality Inspections for Car Dealerships & Used Car Lots: AutoLeARn also has great potential for use in safety and functionality inspections at car dealerships and used car lots. The analyses AutoLeARn provides can help dealerships and used car lots give quick appraisals and check critical components to make sure vehicles meet the necessary standards before being sold. This not only makes the inspection process more efficient for dealerships, but also improves trust and customer satisfaction, since customers can feel confident knowing the vehicle has been thoroughly checked using the latest standards and technology.
References (primarily from the customer insights section for background information)
[2] 2024 survey published by the Auto Care Association using analysis from Hanover Research (https://hedgescompany.com/blog/2024/07/characteristics-automotive-diy-consumers/).
[3] Car Parts July 2023 Website Survey (1500 respondents - https://www.carparts.com/blog/ready-to-roll-do-you-truly-know-your-car-repair-proficiency/).
[4] 2020 SimpleTire Website Survey (1000 respondents - https://simpletire.com/press/releases/simpletire-new-survey-car-owners-steering-towards-DIY).
[5] https://autosphere.ca/dealerships/2023/05/25/the-challenges-of-automotive-training/.