Problem Statement

Identifying a car's exact make, model, and generation from a photo is genuinely hard — even for enthusiasts. Subtle styling differences between generations or trim levels are easy to miss, and there is no simple, accessible tool that lets a regular person snap or paste an image and get a confident answer.

DriftID solves this by combining a state-of-the-art vision backbone with a lightweight classifier to give anyone fast, fine-grained car identification directly in a browser — no ML knowledge required.


Solution Overview

DriftID is an end-to-end ML web application. At its core, a pretrained DINOv3 Vision Transformer (ViT) backbone acts as a frozen feature extractor, converting any car image into a compact 384-dimensional embedding that captures rich visual structure. A linear classifier trained on top of those frozen embeddings then maps the embedding to one of ~196 fine-grained car classes (e.g., Audi A7 Gen 2014–2017 or BMW X5 Gen 2019–2020).

The pipeline is exposed through a FastAPI REST backend that loads the model once at startup and serves predictions over HTTP. A Flutter Web frontend provides the user-facing interface — letting users upload an image from disk or paste a URL, then displaying the top-k predictions with confidence scores. Prediction history is saved locally so users can revisit past results without re-running inference.

User image (upload or URL)
      ↓
Preprocessing (384×384, timm normalization)
      ↓
DINOv3 ViT backbone  →  384-d embedding
      ↓
Linear classifier  →  class logits
      ↓
Top-k predictions + confidence scores
      ↓
FastAPI  →  Flutter Web UI
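
The last two stages of the pipeline, turning class logits into ranked predictions with confidence scores, can be sketched in plain Python. This is a minimal illustration, assuming that the confidence scores are softmax probabilities. The function names and the three-class toy logits are hypothetical and are not taken from the DriftID codebase.

```python
import math

def softmax(logits):
    """Convert raw class logits to probabilities (numerically stable)."""
    peak = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits for three classes; the real model emits ~196 values.
labels = ["Audi A7 Gen 2014–2017", "BMW X5 Gen 2019–2020", "Other"]
logits = [2.0, 1.0, 0.1]
predictions = top_k(logits, labels, k=2)
```

Because softmax preserves the ordering of the logits, the ranking is the same with or without it; the probabilities only matter for the displayed confidence values.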

Key Features

Search

  • Image upload — pick a local file to run inference on
  • URL input — paste a public image URL; the backend fetches and classifies it directly
  • REST API — clean HTTP endpoints (/predict, /predict-url, /health) usable outside the UI
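
For callers outside the UI, a small standard-library client might look like the sketch below. Only the endpoint paths (/health, /predict-url) come from the list above. The base URL, the HTTP methods, and the JSON body with a "url" field are assumptions made for illustration, so check them against the actual API before relying on them.

```python
import json
import urllib.request

# Assumption: the backend runs locally on Uvicorn's default port.
BASE_URL = "http://localhost:8000"

def build_health_request(base_url=BASE_URL):
    """Build a GET request for the health-check endpoint."""
    return urllib.request.Request(f"{base_url}/health", method="GET")

def build_predict_url_request(image_url, base_url=BASE_URL):
    """Build a POST request asking the backend to classify a remote image.

    Assumption: the endpoint accepts a JSON body shaped like {"url": "..."}.
    """
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict-url",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=10):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running backend, `send(build_health_request())` would be a natural first call to confirm the service is up before submitting images.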

Results

  • Top-5 predictions with confidence scores — returns the most likely car classes ranked by probability
  • Prediction history — results are auto-saved to local browser storage so users can browse and reopen past identifications.

Cosmetic (For the real ones)

  • Color theme control — light and dark mode support via a settings store

Technologies Used

ML / Backend

Layer            | Technology
Vision backbone  | DINOv3 ViT (vit_base_patch16_dinov3) via timm
Image transforms | timm data config (384×384, ImageNet normalization)
Classifier       | PyTorch nn.Sequential MLP trained on frozen embeddings
Embedding index  | FAISS (used during training for feature storage/retrieval)
API framework    | FastAPI + Uvicorn
Language         | Python 3.10
Environment      | Conda (gpu-env), CUDA-capable
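
The ImageNet normalization step in the image transforms comes down to simple per-channel arithmetic. The sketch below assumes the standard ImageNet mean and standard deviation. In practice the values would be read from the timm data config for the chosen model, and `normalize_pixel` is a hypothetical helper name.

```python
# Standard ImageNet per-channel statistics (R, G, B) on a 0-1 scale.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )

white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
```

A real transform applies the same arithmetic to every pixel of the resized 384×384 image tensor at once.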

Frontend

Layer             | Technology
UI framework      | Flutter Web (Dart)
HTTP client       | http package
File picking      | file_picker
Local persistence | shared_preferences (localStorage-backed on web)
E2E testing       | Playwright

Dataset

Item    | Detail
Source  | Car Make, Model, and Generation (Kaggle)
Size    | 41,521 images
Classes | ~196 fine-grained categories (make + model + generation)
Split   | 80% train / 20% test
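
One way to produce an 80/20 split over many fine-grained classes is to split within each class, so that rare classes appear in both sets. The sketch below is an assumption about how such a split could be done, not a description of the project's actual procedure. The function name, the seed, and the toy file names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(samples, train_fraction=0.8, seed=0):
    """Split (path, label) pairs so each class keeps roughly the same ratio."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label, paths in sorted(by_label.items()):
        rng.shuffle(paths)
        cut = int(len(paths) * train_fraction)
        train += [(p, label) for p in paths[:cut]]
        test += [(p, label) for p in paths[cut:]]
    return train, test

# Hypothetical toy data: two classes with ten images each.
samples = [(f"img_{c}_{i}.jpg", c) for c in ("class_a", "class_b") for i in range(10)]
train, test = stratified_split(samples)
```

Splitting per class avoids the situation where a small class lands entirely in the training set or entirely in the test set.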

Infrastructure / Tooling

  • Docker Dev Container — reproducible dev environment with Flutter SDK, Chromium, and gpu-env pre-installed
  • Hugging Face - used to host the app's live demo: DriftID Demo

Target Users

Two primary personas drove the product design:

Car enthusiast — someone who enjoys identifying vehicles by sight. They want a fast, frictionless way to confirm or challenge their own identification of a car from a photo, and care about accuracy on subtle generational differences.

Used-car shopper — someone browsing listings who encounters an unfamiliar vehicle. They want a quick answer to "what exactly is this car?" without needing automotive knowledge, and they value readable labels and a clean interface that does not get in the way.


Team Members

  • Kien Nguyen (Developer): Machine Learning Pipeline, Model Development, Data Processing
  • Ray Wang (Developer): Flutter Web, API Backend, UI/UX Design
