About the Project

Background & Inspiration

In a museum, it is easy to feel lost in front of hundreds of paintings and not know where to start. You try to read the wall labels, but crowds get in the way. Later, study materials end up scattered across countless folders and tabs. We wanted to solve both problems: information that is hard to reach in the moment and hard to study systematically afterward. So we built a website for structured learning about artworks: take a photo or upload an image of a painting and instantly get the artist, title, year, and genre, plus similar works. This turns fragmented knowledge into a coherent study path for art students.

One-Line Pitch

Upload a painting → instant identification → two structured JSONs (best match + top-3 similars) with visualization.

How We Built It

Data & Subset: Starting from a WikiArt-style dataset with classes.csv/metadata.csv, we drew a stratified sample of 5,000 images (optionally stratified by genre), copied/downloaded them into a local subset, and collapsed duplicated genre lists to a single label (e.g., ["Naive Art Primitivism", ...] → "Naive Art Primitivism").
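The cleaning and sampling step above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual script: the function names, the "genre" key, and the proportional quota rule are all assumptions.

```python
import ast
import random
from collections import defaultdict


def clean_genre(raw):
    """Collapse a list-like genre field to its first entry (assumed rule)."""
    text = raw.strip()
    if text.startswith("["):
        try:
            values = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            return text  # leave unparseable values untouched
        return str(values[0]).strip() if values else ""
    return text


def stratified_sample(rows, key, total, seed=0):
    """Sample about `total` rows while keeping each group's share of the data."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        # Proportional quota, with at least one row per group.
        quota = max(1, round(total * len(members) / len(rows)))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample
```

Because every group gets at least one row and quotas are rounded, the result can differ slightly from `total` on real data.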
Representation Learning: We used CLIP (ViT-B/32) to encode each image into a 512-D vector, followed by L2 normalization.
Vector Search:
- Default: FAISS IndexFlatIP (fast and reliable for small/medium scales).
- When FAISS isn’t available on some macOS/Python combos, we fall back to NumPy cosine search, which is still snappy at 5k scale.
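One way the FAISS-to-NumPy fallback could look is sketched below. It relies on the fact that, for L2-normalized vectors, the inner product equals cosine similarity, so both paths rank results the same way. The helper names `l2_normalize` and `top_k` are hypothetical, and the index is rebuilt on every call only to keep the example short.

```python
import numpy as np

try:
    import faiss  # optional; may be missing on some platforms
except ImportError:
    faiss = None


def l2_normalize(vectors):
    """Scale each row to unit length so inner product equals cosine similarity."""
    vectors = np.asarray(vectors, dtype=np.float32)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)


def top_k(query, gallery, k=3):
    """Return (scores, indices) of the k gallery rows most similar to query."""
    gallery = np.ascontiguousarray(l2_normalize(gallery))
    query = np.ascontiguousarray(l2_normalize(np.atleast_2d(query)))
    k = min(k, len(gallery))
    if faiss is not None:
        index = faiss.IndexFlatIP(gallery.shape[1])  # exact inner-product search
        index.add(gallery)
        scores, ids = index.search(query, k)
        return scores[0], ids[0]
    scores = gallery @ query[0]        # cosine similarity per gallery row
    ids = np.argsort(-scores)[:k]      # highest similarity first
    return scores[ids], ids
```

In a real service the index would be built once at startup and reused across requests.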
Backend: FastAPI + Uvicorn exposing /search/file and /search/url, returning:
- top_result.json — the best match;
- top3_similars.json — the three closest works.
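As an illustration of the two outputs, the sketch below splits a ranked list of matches into the two JSON documents. The field names ("score", "result", "similars") are assumptions, as is the choice to list the three works ranked after the best match rather than including it; the project's actual schema may differ.

```python
import json


def build_payloads(ranked):
    """Split ranked matches into the best match and the next three similars."""
    ordered = sorted(ranked, key=lambda item: item["score"], reverse=True)
    top_result = ordered[0] if ordered else None
    return {
        "top_result.json": json.dumps({"result": top_result}, indent=2),
        # Assumption: similars exclude the best match itself.
        "top3_similars.json": json.dumps({"similars": ordered[1:4]}, indent=2),
    }
```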
Frontend (Tiny Website): Vanilla HTML/JS (no framework). Supports file/URL search, card-style visualization, one-click JSON download, and optional thumbnails (served safely via the API).
Kaggle Prototyping: On Kaggle we handled sampling/packaging/streamed indexing (memory-friendly) and versioned the index; the local service then loads that index directly.

Product Highlights

All key facts at a glance: artist / title / year / genre + similar works.
Structured outputs: two JSON files for immediate UI display, note-taking, or downstream analysis.
Visual + downloadable: card UI and one-click JSON export; load past JSONs for review.
Robust engineering: memory-friendly indexing on Kaggle, automatic macOS fallback when FAISS isn’t present, and filename slug parsing (artist_title-year) for reliable field extraction.
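The filename slug parsing mentioned above might look something like this. It is a sketch assuming hyphenated slugs of the form artist_title-year with an optional four-digit year; the regular expression and the `parse_slug` name are illustrative, not the project's code.

```python
import re
from pathlib import Path

# artist, underscore, title, then an optional trailing "-YYYY".
SLUG = re.compile(r"^(?P<artist>[^_]+)_(?P<title>.+?)(?:-(?P<year>\d{4}))?$")


def parse_slug(filename):
    """Extract artist, title, and year from a slug-style filename."""
    stem = Path(filename).stem
    match = SLUG.match(stem)
    if match is None:
        # No underscore separator: treat the whole stem as the title.
        return {"artist": None, "title": stem.replace("-", " ").title(), "year": None}
    year = match.group("year")
    return {
        "artist": match.group("artist").replace("-", " ").title(),
        "title": match.group("title").replace("-", " ").title(),
        "year": int(year) if year else None,
    }
```

A title that itself ends in a four-digit number would be misread as having a year, so ambiguous cases are better resolved from the metadata file when it is available.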

Challenges & Solutions

On-site access barriers: museum crowds block the wall labels → we made "take a photo or upload, then identify" the default flow to shorten time-to-info.
Fragmented materials: unify to standard JSON for classroom work, assignments, and personal knowledge bases.
Resource limits (Kaggle/local): batched embedding + streamed index building to avoid OOM.
Environment compatibility: FAISS wheels lag newer Python versions → NumPy cosine fallback keeps progress unblocked.
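The batched embedding and streamed index building described above can be sketched as below. Only one batch of vectors is held in memory at a time. `embed_batch` and `add_to_index` are hypothetical callables standing in for the CLIP encoder and the index's add method, and the batch size of 64 is an arbitrary default.

```python
from itertools import islice


def batched(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_index_streaming(paths, embed_batch, add_to_index, batch_size=64):
    """Embed and index one batch at a time; return the number of items indexed."""
    count = 0
    for batch in batched(paths, batch_size):
        vectors = embed_batch(batch)  # hypothetical: list of paths -> list of vectors
        add_to_index(vectors)         # e.g. a FAISS index's add method
        count += len(batch)
    return count
```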

What We Learned

In end-to-end systems, data cleaning and naming conventions often matter more than micro-tuning the model.
A clear degradation path (FAISS → NumPy) dramatically reduces environment friction and improves demo/teaching reliability.
Small, stable frontends (zero dependencies) are great for teaching and quick deployment.

Who It’s For & Why It Matters

Art students: perform systematic comparison and style recognition—building a loop of search → read → record.
General visitors: get immediate, clean artwork info even in crowded galleries.
Teachers/TAs: share a prompt image before class, compare live in class, and preserve standardized JSONs for assignments and assessment.

What’s Next

Model upgrades (e.g., ViT-L/14) and multilingual labels.
Text re-ranking and fusion with tags/OCR to disambiguate near-lookalike works.
Mobile-friendly + offline packs (smaller encoders + quantization/PQ).
Learning paths: organize similars into style/school timelines and “must-read bundles.”

Built With

clip
css
faiss
fastapi
next.js
python
tailwind

Updates

Yunhan Zhang started this project — Sep 07, 2025 02:52 AM EDT
