Inspiration

Every day we see cars that fascinate us but whose details we don't know. That inspired us to build an application that can tell us more about them.

What it does

Given a picture of a car, the app uses computer vision to determine its make and model. Users can also ask questions about the car, and an LLM returns more information and a summary of the car.

How we built it

We use Flask, a lightweight Python web framework, to connect the Flutter frontend to the Python-based machine learning backend. Flutter is a framework for building apps for many platforms, although we developed our app with iOS in mind. The Flutter app sends HTTP requests containing the image and the prompt to the Flask backend. The API then calls the fastai library and responds with text that is displayed in the app.

On the AI side, we fine-tuned a pre-trained DenseNet CNN on the Stanford Cars dataset to classify car make and model. The dataset consists of 16,185 images across 196 classes of cars. We evaluated several model architectures, including VGG, ResNet, and DenseNet, and DenseNet-201 achieved 85% validation and test accuracy. We also used a distilled 81.5M-parameter RoBERTa model fine-tuned for question answering. After the CNN predicts the car make and model for an input image, data on that car is scraped from the Kelley Blue Book website and passed to the distilled RoBERTa model together with the user's question. The model then outputs an answer based on the question and the car data.
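The flow from image to answer can be sketched as three stages chained together. This is only an illustration, not our actual backend code: the function `answer_car_question`, the `CarAnswer` type, and the three callables are hypothetical names, and the stages are passed in as plain functions so the sketch runs without fastai, a QA model, or network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CarAnswer:
    make_model: str  # label predicted by the image classifier
    answer: str      # text returned by the question-answering model


def answer_car_question(
    image_bytes: bytes,
    question: str,
    classify: Callable[[bytes], str],
    fetch_car_text: Callable[[str], str],
    answer_question: Callable[[str, str], str],
) -> CarAnswer:
    """Run the three stages in order: classify, fetch context, answer."""
    # Stage 1: the CNN maps the image to a make/model label.
    make_model = classify(image_bytes)
    # Stage 2: look up descriptive text for that car (scraped in our app).
    context = fetch_car_text(make_model)
    # Stage 3: the QA model answers the question using that text as context.
    answer = answer_question(question, context)
    return CarAnswer(make_model=make_model, answer=answer)


if __name__ == "__main__":
    # Stand-in stages (assumptions, for demonstration only).
    result = answer_car_question(
        b"fake-image-bytes",
        "How much horsepower does it have?",
        classify=lambda img: "Example Make Model 2012",
        fetch_car_text=lambda name: f"The {name} has 200 horsepower.",
        answer_question=lambda q, ctx: ctx.split(" has ")[1].rstrip("."),
    )
    print(result)
```

In a Flask route, a function like this would be called with the uploaded image bytes and the prompt from the request, and the resulting text returned to the Flutter app.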

Challenges we ran into

One challenge we ran into was scraping data about the car from the Internet to prompt the LLM. At first, we used data from Google searches about the car, but this did not give the LLM enough information to answer questions well. We tackled this by instead scraping data from a particular car website and defining pattern-matching rules to extract the data accurately. This significantly improved the LLM's question-answering ability.
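As a rough illustration of what pattern-matching rules over scraped page text can look like, here is a minimal sketch using regular expressions. The field names, the patterns in `SPEC_PATTERNS`, and the sample text are all assumptions made for the example; they are not the rules we wrote or the layout of any real website.

```python
import re
from typing import Dict

# Hypothetical rules: field name -> regex with exactly one capture group.
SPEC_PATTERNS = {
    "horsepower": re.compile(r"(\d+)\s*(?:hp|horsepower)", re.IGNORECASE),
    "mpg_city": re.compile(r"(\d+)\s*(?:mpg\s*)?city", re.IGNORECASE),
    "seats": re.compile(r"seats\s*(\d+)", re.IGNORECASE),
}


def extract_specs(page_text: str) -> Dict[str, str]:
    """Apply each rule to the page text and keep the first match per field."""
    specs = {}
    for field, pattern in SPEC_PATTERNS.items():
        match = pattern.search(page_text)
        if match:
            specs[field] = match.group(1)
    return specs


def specs_to_context(make_model: str, specs: Dict[str, str]) -> str:
    """Turn the extracted fields into one sentence-like context string."""
    facts = [f"{field.replace('_', ' ')}: {value}" for field, value in specs.items()]
    return f"{make_model}. " + "; ".join(facts) + "."


if __name__ == "__main__":
    sample = "Engine: 2.0L, 200 hp. Fuel economy: 24 city / 32 highway. Seats 5."
    specs = extract_specs(sample)
    print(specs_to_context("Example Car", specs))
```

Compared with free-form search results, a fixed set of rules like this yields a compact, predictable context, which is the kind of input an extractive question-answering model handles best.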

Accomplishments that we're proud of

We are proud that our computer vision models can achieve 85% accuracy. In addition, we are satisfied that our LLM can answer many different user questions.

What we learned

We all gained a lot of practical experience in backend development, Flutter, Flask, computer vision, and NLP. Everyone contributed to each part of the project.

What's next for CarScope

We want to optimize the inference speed of LLM generation and serve LLMs better with vLLM. We also want to fine-tune the LLM on car description data for better question answering.
