About the Project

Background & Inspiration

In museums, it’s easy to feel lost in front of hundreds of paintings and not know where to start. You try to read the wall labels, but crowds get in the way. Later, study materials end up scattered across countless folders and tabs. To solve these twin problems (information that is hard to access in the moment and hard to study systematically afterward), we built a website for structured learning of artworks: take a photo or upload an image of a painting and instantly get the artist, title, year, and genre, plus similar works. This turns fragmented knowledge into a coherent study path for art students.

One-Line Pitch

Upload a painting → instant identification → two structured JSONs (best match + top-3 similars) with visualization.

How We Built It

  • Data & Subset: Starting from a WikiArt-style dataset with classes.csv/metadata.csv, we drew a stratified sample of 5,000 images (optionally stratified by genre), copied/downloaded them into a local subset, and collapsed duplicated genre labels (e.g., ["Naive Art Primitivism", ...] → "Naive Art Primitivism").
  • Representation Learning: We used CLIP (ViT-B/32) to encode each image into a 512-D vector, followed by L2 normalization.
  • Vector Search:

    • Default: FAISS IndexFlatIP (fast and reliable for small/medium scales).
    • Fallback: when FAISS isn’t available on some macOS/Python combos, we fall back to NumPy cosine search, which is still snappy at 5k scale.
  • Backend: FastAPI + Uvicorn exposing /search/file and /search/url, returning:

    • top_result.json — the best match;
    • top3_similars.json — the three closest works.
  • Frontend (Tiny Website): Vanilla HTML/JS (no framework). Supports file/URL search, card-style visualization, one-click JSON download, and optional thumbnails (served safely via the API).

  • Kaggle Prototyping: On Kaggle we handled sampling/packaging/streamed indexing (memory-friendly) and versioned the index; the local service then loads that index directly.
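
The search path described above (L2-normalized vectors, FAISS IndexFlatIP by default, NumPy cosine search as the fallback) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `l2_normalize` and `top_k` are hypothetical, and random vectors stand in for real CLIP embeddings.

```python
import numpy as np

try:
    import faiss  # optional; may be missing on some macOS/Python combos
except ImportError:
    faiss = None


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so inner product equals cosine similarity."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)


def top_k(index_vecs: np.ndarray, query: np.ndarray, k: int):
    """Return (scores, ids) of the k rows most similar to `query`."""
    vecs = np.ascontiguousarray(l2_normalize(index_vecs))
    q = np.ascontiguousarray(l2_normalize(query.reshape(1, -1)))
    if faiss is not None:
        # Exact inner-product search; on unit vectors this is cosine similarity.
        index = faiss.IndexFlatIP(vecs.shape[1])
        index.add(vecs)
        scores, ids = index.search(q, k)
        return scores[0], ids[0]
    # Fallback: brute-force cosine search with a single matrix product.
    sims = (vecs @ q.T).ravel()
    ids = np.argsort(-sims)[:k]
    return sims[ids], ids


# Stand-in data: 100 random 512-D "embeddings" and a slightly noisy query.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(100, 512))
query = vecs[7] + 0.01 * rng.normal(size=512)
scores, ids = top_k(vecs, query, k=4)
```

With `k=4`, the first hit could feed top_result.json and the remaining three top3_similars.json, assuming the two files are meant to be disjoint (the write-up does not say either way).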

Product Highlights

  • All key facts at a glance: artist / title / year / genre + similar works.
  • Structured outputs: two JSON files for immediate UI display, note-taking, or downstream analysis.
  • Visual + downloadable: card UI and one-click JSON export; load past JSONs for review.
  • Robust engineering: memory-friendly indexing on Kaggle, automatic macOS fallback when FAISS isn’t present, and filename slug parsing (artist_title-year) for reliable field extraction.
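
As an illustration of the filename slug parsing mentioned above, here is a small sketch that splits an artist_title-year slug into fields. The regular expression, the `parse_slug` name, and the example filename are assumptions for illustration; the project's real parser may treat edge cases differently.

```python
import re
from pathlib import Path

# artist, underscore, title, then an optional trailing -YYYY.
SLUG_RE = re.compile(r"^(?P<artist>[^_]+)_(?P<title>.+?)(?:-(?P<year>\d{4}))?$")


def _pretty(part: str) -> str:
    """Turn 'the-starry-night' into 'The Starry Night'."""
    return part.replace("-", " ").title()


def parse_slug(filename: str) -> dict:
    """Extract artist, title, and year from an artist_title-year filename."""
    stem = Path(filename).stem
    match = SLUG_RE.match(stem)
    if match is None:
        # No underscore separator: keep the whole stem as the title.
        return {"artist": None, "title": _pretty(stem), "year": None}
    year = match.group("year")
    return {
        "artist": _pretty(match.group("artist")),
        "title": _pretty(match.group("title")),
        "year": int(year) if year else None,
    }


info = parse_slug("vincent-van-gogh_the-starry-night-1889.jpg")
```

A title that has no trailing four-digit year simply yields `year: None`, which keeps the JSON schema stable across records.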

Challenges & Solutions

  • On-site access barriers: museum crowds block wall labels → we made "take a photo or upload → identify" the default flow to shorten time-to-info.
  • Fragmented materials: unify to standard JSON for classroom work, assignments, and personal knowledge bases.
  • Resource limits (Kaggle/local): batched embedding + streamed index building to avoid OOM.
  • Environment compatibility: FAISS wheels lag newer Python versions → NumPy cosine fallback keeps progress unblocked.
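
The batched embedding and streamed index building mentioned above might look like the sketch below. It is only an outline under stated assumptions: `embed_fn` is a hypothetical callable standing in for the CLIP image encoder, and `build_index_streamed` and the batch size are illustrative choices, not the project's actual names or settings.

```python
from typing import Callable, Iterator, Sequence

import numpy as np


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def build_index_streamed(
    paths: Sequence[str],
    embed_fn: Callable[[Sequence[str]], np.ndarray],
    dim: int = 512,
    batch_size: int = 64,
) -> np.ndarray:
    """Fill a preallocated float32 matrix one batch at a time.

    Only one batch of embeddings is held in memory besides the output
    matrix, so peak memory stays bounded as the image count grows.
    """
    out = np.empty((len(paths), dim), dtype=np.float32)
    row = 0
    for batch in batched(paths, batch_size):
        vecs = np.asarray(embed_fn(batch), dtype=np.float32)
        norms = np.linalg.norm(vecs, axis=1, keepdims=True)
        out[row:row + len(batch)] = vecs / np.maximum(norms, 1e-12)  # L2-normalize
        row += len(batch)
    return out


# Stand-in encoder: random vectors instead of real CLIP embeddings.
_rng = np.random.default_rng(0)


def fake_embed(batch: Sequence[str]) -> np.ndarray:
    return _rng.normal(size=(len(batch), 512))


image_paths = [f"img_{i}.jpg" for i in range(150)]
index_matrix = build_index_streamed(image_paths, fake_embed, batch_size=64)
```

For collections too large to hold in RAM, the preallocated array could be swapped for an `np.memmap` backed by a file, which keeps the same fill-by-slice pattern.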

What We Learned

  • In end-to-end systems, data cleaning and naming conventions often matter more than micro-tuning the model.
  • A clear degradation path (FAISS → NumPy) dramatically reduces environment friction and improves demo/teaching reliability.
  • Small, stable frontends (zero dependencies) are great for teaching and quick deployment.

Who It’s For & Why It Matters

  • Art students: perform systematic comparison and style recognition—building a loop of search → read → record.
  • General visitors: get immediate, clean artwork info even in crowded galleries.
  • Teachers/TAs: share a prompt image before class, compare live in class, and preserve standardized JSONs for assignments and assessment.

What’s Next

  • Model upgrades (e.g., ViT-L/14) and multilingual labels.
  • Text re-ranking and fusion with tags/OCR to disambiguate near-lookalike works.
  • Mobile-friendly + offline packs (smaller encoders + quantization/PQ).
  • Learning paths: organize similars into style/school timelines and “must-read bundles.”
