RareDex — Collecting the World Around Us
Inspiration
RareDex started from a simple observation: people already take photos of everything they see, but those photos usually stop at personal storage or social feeds. There’s no sense of progress, completion, or shared understanding of what those images represent.
We were inspired by collection-driven systems like Pokédex-style games, but wanted to ground the experience in the real world. Instead of collecting fictional creatures, what if users could collect objects they’ve actually encountered—street signs, gadgets, tools, everyday items—and slowly build a structured record of the world around them?
At the same time, we wanted to avoid explicitly framing things as “rare” or “valuable.” The goal wasn’t scarcity—it was curiosity. RareDex is about noticing, identifying, and organizing what you see.
What We Built
RareDex is a web-based discovery and collection game.
Users:
- Take a photo using their device camera
- Upload it to the platform
- Let AI automatically classify the object into a coarse category
- Select a more specific label to finalize the entry
- Unlock that object in their personal collection
- Explore a scrolling discovery feed of what others have found
To keep the system clean and trustworthy:
- Submissions go through an AI-based verification step
- The community can report incorrect entries
- If a post is reported three times, it is automatically removed from the feed
The result is a living, crowdsourced collection system that blends automation with human judgment.
How We Built It
Architecture Overview
We split the system into three clean layers:
- Frontend: React + TypeScript, deployed on Vercel
- Backend / Data Layer: Supabase (Postgres, Auth, Storage)
- ML Services: Deployed endpoints that call Gemini for classification and verification
The key design choice was to treat every photo as a submission lifecycle rather than a single atomic action.
Submission Lifecycle
- A user takes a photo
- A `submissions` row is created immediately with `status = 'pending'`
- The image is uploaded to Supabase Storage using the submission UUID
- An ML service performs coarse classification
- The user selects a fine-grained label
- A second ML service verifies the match
- Only after verification does the submission become `status = 'active'` and appear in the feed
This decoupling allowed us to:
- Get stable IDs early
- Avoid half-finished posts in the feed
- Retry ML safely without duplicating data
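The lifecycle above can be sketched as a small orchestrator. This is an illustrative sketch, not the real implementation: the `Services` interface and its method names are assumptions standing in for the actual Supabase and Gemini calls.

```typescript
// Hypothetical sketch of the RareDex submission lifecycle.
// All service names here are assumptions, not the real API surface.

type Status = "pending" | "active" | "rejected";

interface Submission {
  id: string;
  status: Status;
  coarseCategory?: string;
  label?: string;
}

interface Services {
  createPending(): Submission;                 // insert row with status = 'pending'
  uploadImage(id: string, image: Uint8Array): void;
  classifyCoarse(id: string): string;          // first ML pass: coarse category
  verify(id: string, label: string): boolean;  // second ML pass: image/label match
}

// Runs the full lifecycle. Because the row exists from step one, retries and
// mid-flow drop-offs never leave half-finished posts in the feed.
function submitPhoto(
  services: Services,
  image: Uint8Array,
  pickLabel: (coarse: string) => string,
): Submission {
  const sub = services.createPending();            // stable ID before anything else
  services.uploadImage(sub.id, image);             // storage path keyed by the UUID
  sub.coarseCategory = services.classifyCoarse(sub.id);
  sub.label = pickLabel(sub.coarseCategory);       // user selects fine-grained label
  sub.status = services.verify(sub.id, sub.label) ? "active" : "rejected";
  return sub;
}
```

The key property is that every ML call is keyed by the submission UUID, so a retried step overwrites rather than duplicates.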
Community Moderation
To balance automation with trust, we implemented a lightweight reporting system:
- Users can report incorrect submissions
- Each report increments a counter
- Once `report_count >= 3`, the submission is automatically rejected
This is enforced at the database level using triggers, ensuring correctness even if the frontend fails.
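The actual enforcement lives in a Postgres trigger, but its logic can be mirrored in TypeScript for illustration (the field names here are assumptions):

```typescript
// Mirrors the database-level moderation rule: three reports reject a post.
const REPORT_THRESHOLD = 3;

interface Reportable {
  reportCount: number;
  status: "pending" | "active" | "rejected";
}

// Called once per report. Returns a new record rather than mutating in place,
// matching how a BEFORE UPDATE trigger rewrites the incoming row.
function applyReport(sub: Reportable): Reportable {
  const reportCount = sub.reportCount + 1;
  const status = reportCount >= REPORT_THRESHOLD ? "rejected" : sub.status;
  return { ...sub, reportCount, status };
}
```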
Challenges We Faced
Open-World Image Classification Is Harder Than It Looks
Our core ML challenge was that RareDex is open-vocabulary and user-driven. Unlike standard image classification tasks, we were not classifying against a fixed, closed label set. Users can encounter anything, and the same object can appear under wildly different lighting, angles, backgrounds, and contexts.
We experimented with several approaches:
YOLO-style object detection
Good at detecting where objects are, but weak for semantic understanding. YOLO struggled when:
- the object filled most of the frame
- the object class was not in the pretrained label set
- multiple plausible object interpretations existed
CLIP-style image–text embeddings
CLIP was powerful for semantic similarity, but introduced ambiguity:
- Similar objects (e.g. “remote”, “controller”, “gamepad”) often clustered tightly
- Prompt sensitivity caused unstable predictions
- Confidence calibration was difficult — cosine similarity alone is not a probability
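The calibration problem is easy to demonstrate. In the toy sketch below (not real CLIP output), three labels score almost identically, and the "confidence" a softmax reports depends almost entirely on the temperature chosen, not on the image:

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Softmax over similarity scores with a temperature. CLIP-style pipelines scale
// logits before softmax, and that scale dominates the apparent confidence.
function softmax(scores: number[], temperature: number): number[] {
  const exps = scores.map((s) => Math.exp(s / temperature));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}
```

With similarities like 0.31 / 0.30 / 0.29 for "remote" / "controller" / "gamepad", temperature 1 gives the top label roughly 0.34 (near-uniform), while temperature 0.01 inflates it to roughly 0.67, even though nothing about the image changed.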
We learned quickly that no single model was sufficient. Detection models lack semantics; embedding models lack grounding.
The final design shifted toward a two-stage human-in-the-loop pipeline:
- ML performs coarse semantic narrowing
- Users provide the fine-grained classification
- A second ML pass verifies consistency between image and user-selected label
This reframing turned ML from a brittle oracle into a decision-support system, which dramatically improved robustness.
Verification vs Classification
A key insight was separating classification from verification.
Instead of asking:
“What is this object?”
We asked:
“Does this image plausibly match the label the user selected?”
This is a fundamentally easier problem.
This reframing allowed us to:
- tolerate noisy user input
- reject confidently wrong labels
- keep ML latency low
- reduce hallucinated certainty
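The verification step can be constrained to a yes/no question, which is much easier to parse reliably than an open-ended label. A minimal sketch, assuming the model call itself happens elsewhere (the prompt wording below is illustrative, not the production prompt):

```typescript
// Builds the constrained verification question sent alongside the image.
function buildVerificationPrompt(label: string): string {
  return `Does this image plausibly show: "${label}"? Answer YES or NO.`;
}

// Parses the constrained answer. Anything that is not a clear YES counts as a
// failure to verify, which keeps hallucinated certainty out of the feed.
function parseVerification(answer: string): boolean {
  return answer.trim().toUpperCase().startsWith("YES");
}
```

Defaulting ambiguous answers to "not verified" biases the system toward false rejections, which the user can retry, rather than false publications, which pollute the feed.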
Async ML Pipelines and State Consistency
ML inference is slow, asynchronous, and failure-prone. Running multiple models in sequence (classification → verification) exposed us to state explosion problems:
- submissions partially classified
- retries producing duplicate writes
- users navigating away mid-inference
- race conditions between ML responses and UI updates
Our initial implementation tightly coupled “submission” with “publication,” which caused unfinished entries to leak into the feed.
The fix was architectural: separate existence from visibility.
We introduced a strict submission lifecycle:
- `pending`: image exists, ML and user actions in progress
- `active`: verified and safe to display
- `rejected`: failed verification or community takedown
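"Separate existence from visibility" reduces to a small transition table plus a feed filter. A sketch of the idea (the table itself is an illustration of the rules described, not copied from the codebase):

```typescript
type Status = "pending" | "active" | "rejected";

// Which transitions the lifecycle permits.
const transitions: Record<Status, Status[]> = {
  pending: ["active", "rejected"], // verification outcome
  active: ["rejected"],            // community takedown
  rejected: [],                    // terminal
};

function canTransition(from: Status, to: Status): boolean {
  return transitions[from].includes(to);
}

// Visibility is a filter over status, not a separate storage concept:
// the feed only ever queries rows with status = 'active'.
function visibleInFeed(status: Status): boolean {
  return status === "active";
}
```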
ML Meets Production Reality
Finally, deploying ML-backed features surfaced issues that don’t appear in notebooks:
- browser image preprocessing differences (JPEG orientation, resizing)
- inconsistent camera metadata across devices
- latency hiding and UI feedback
- secure key management (no client-side inference)
- deterministic fallbacks when ML endpoints fail
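A deterministic fallback can be as simple as racing the ML call against a timeout and resolving both errors and timeouts to the same fixed answer. A minimal sketch, assuming the classifier is exposed as an async function:

```typescript
// Races an ML classification call against a timeout. Errors and timeouts both
// resolve to the same fallback value, so the same failure always produces the
// same result instead of blocking the submission flow.
async function classifyWithFallback(
  classify: () => Promise<string>,
  timeoutMs: number,
  fallback = "unknown",
): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<string>((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  try {
    // Any endpoint error resolves to the fallback instead of surfacing to the UI.
    return await Promise.race([classify().catch(() => fallback), timeout]);
  } finally {
    clearTimeout(timer); // don't leave a dangling timer when the model wins
  }
}
```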
We learned that production ML is as much systems engineering as modeling. Most of the difficulty wasn’t model accuracy — it was orchestration, trust boundaries, and failure handling.
By the end, RareDex had evolved from “an image classifier” into a distributed ML system with human verification, lifecycle control, and community moderation — far more technically challenging, and far more robust.
What We Learned
- Creating data early and enriching it later simplifies distributed systems
- Visibility is a UI concern, not a storage concern
- ML pipelines work best as stateless services tied together by stable IDs
- Community moderation scales better when enforced at the database layer
- Production browser behavior differs dramatically from localhost
Most importantly, we learned how to design systems that blend automation with human input without losing trust.
Looking Ahead
RareDex could evolve into:
- A large-scale object classification dataset
- A tool for education and exploration
- A platform for understanding how people perceive and categorize the world
What started as a game became a lesson in building reliable, human-in-the-loop systems.
RareDex isn’t about rarity.
It’s about paying attention.
Built With
- clip
- cnn
- gemini
- ml
- python
- railway
- react
- supabase
- typescript
- vercel
- vite