Inspiration
Every week I'd scroll TikTok or Instagram, see someone wearing a watch, ring, jacket or necklace I loved, and hit the same dead end: the creator never tagged the brand, comments were "wym this gorgeous," and a reverse image search gave me 40 dropshippers selling 40 different products. Even when I found the item, I had no idea how it would actually look on me.
onMe is the app I wanted: see something on someone, tap once, see it on yourself, tap again to buy the real one.
What it does
- Scan a look. Open the camera, paste a TikTok URL, or pick a photo from your camera roll.
- Detect every wearable. GPT-4o-mini reads the image and returns each item — watch, ring, necklace, earring, bag, outfit — with a description and a bounding box. Grounding-DINO then tightens each bounding box to pixel precision.
- Find the real product. Each detection is queried against Google Lens and Yandex in parallel, social-media domains are stripped out, GPT-4o reranks the candidates against the item description, and the worker falls through to Google Shopping, eBay and Google Images so you always get a buy link.
- Try it on you. One tap pipes your saved body photo (wrist / neck / finger / ear / full-body) plus the product image into Perfect Corp's YouCam API. Render comes back in seconds, on you, in your room.
- Cop or drop. Save it to your bag, share it, or move on.
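The "find the real product" step above — strip social-media domains, then rerank what survives — can be sketched as a small filter. This is an illustrative sketch, not the actual onMe code: the `Candidate` type and the domain list are assumptions.

```typescript
// Hypothetical shape of a reverse-image-search result.
type Candidate = { title: string; link: string };

// Assumed blocklist; the real app's list may differ.
const SOCIAL_DOMAINS = ["tiktok.com", "instagram.com", "pinterest.com", "facebook.com", "x.com"];

// True when the URL's host is (or is a subdomain of) a social domain.
function isSocial(url: string): boolean {
  const host = new URL(url).hostname.replace(/^www\./, "");
  return SOCIAL_DOMAINS.some((d) => host === d || host.endsWith("." + d));
}

// Keep only candidates that could plausibly be a shop page;
// the surviving list is what gets handed to the GPT-4o reranker.
function filterShoppable(candidates: Candidate[]): Candidate[] {
  return candidates.filter((c) => !isSocial(c.link));
}
```

If the filtered list comes back empty, the worker would fall through to the next source (Shopping, eBay, Images) rather than return nothing.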
How we built it
- Mobile app: Expo + Expo Router (React Native + React Native Web). Single codebase ships to iOS, Android, and the in-browser preview judges use.
- Backend: A single Cloudflare Worker (`worker/index.ts`) acts as the trust boundary. Every paid third-party API key (Perfect Corp, OpenAI, Replicate, SerpAPI) lives only in Worker secrets. The mobile client never holds a third-party key — only a shared `X-Onme-Token` bearer that authenticates legitimate clients to the Worker, plus an image-host allowlist so the Worker only proxies URLs we trust.
- Auth + storage: Supabase magic-link auth, deep-linked back into the app. Selfies live in a private Supabase Storage bucket gated by per-user RLS — `auth.uid()` must match the top-level folder name. Try-on calls use short-lived signed URLs.
- Try-on engine: Perfect Corp YouCam Online Editor API (`/s2s/...`). Per-feature paths and versions differ — `v3.0/task/cloth` for outfits, `v2.0/task/2d-vto/ring` for rings, `v2.0/task/bag` for bags, etc. — wrapped behind a thin per-feature client.
- Look-match pipeline: GPT-4o-mini for detection + descriptions, Replicate Grounding-DINO for precise bboxes, SerpAPI for Lens/Yandex/Shopping/eBay/Images, GPT-4o again as a reranker.
- Resilience: a 4-pass corrective retry on YouCam — if YouCam returns "couldn't detect body part," the `/diagnose` endpoint picks the next mutation strategy (re-crop, brighten, swap reference, fall back to a different body photo) and retries automatically.
- Crash reporting: Sentry, gated `enabled: !__DEV__`.
- Distribution: EAS Build for native, `expo start --web` for the Devpost preview.
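The Worker-as-trust-boundary idea above boils down to two checks before anything is proxied: the shared bearer token, and the image-host allowlist. A minimal sketch follows; the `X-Onme-Token` header name comes from the write-up, while the allowlist contents and request shape are hypothetical stand-ins, not the real `worker/index.ts`.

```typescript
// Assumed allowlist of image hosts the Worker will proxy.
const ALLOWED_IMAGE_HOSTS = new Set(["supabase.co"]);

// Only https URLs on an allowlisted host (or its subdomains) pass,
// so the Worker can't be used as an open proxy.
function isAllowedImageUrl(raw: string): boolean {
  try {
    const { protocol, hostname } = new URL(raw);
    if (protocol !== "https:") return false;
    return [...ALLOWED_IMAGE_HOSTS].some((h) => hostname === h || hostname.endsWith("." + h));
  } catch {
    return false; // unparseable URL → reject
  }
}

// The client proves it is a legitimate app build by presenting the
// shared bearer token; no third-party key ever leaves the Worker.
function authorize(headers: Map<string, string>, sharedToken: string): boolean {
  return headers.get("X-Onme-Token") === sharedToken;
}
```

In a real Worker these checks would run at the top of the `fetch` handler, with the token read from a Worker secret rather than a literal.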
Challenges we ran into
- YouCam error "2". YouCam silently fails when the required body part isn't visible in the source photo (a face selfie can't try on a watch). We turned that failure mode into a feature: the `/diagnose` endpoint reads the error, picks a mutation strategy, and `photoMutator.ts` retries up to 4 times — re-crop, swap to a different saved body photo, brighten, etc.
- YouCam has no file-upload step. It takes public HTTPS URLs only. We upload the selfie to Supabase Storage first and pass a signed URL — which then forced us to add an image-host allowlist on the Worker so we don't become an open proxy.
- Per-feature paths and versions differ. Watch, ring, necklace, earring, bag, outfit each live on a different YouCam path and a different API version. We pulled the bash sample for each feature directly from the YouCam Playground rather than guessing.
- Lens results are 90% noise. Raw Lens output is full of TikTok mirrors, Pinterest pins, and dropshippers. We filter out social domains, then GPT-4o reranks the survivors against the item description, then we fall through to Shopping/eBay/Images. Every detection ends with at least one shoppable link.
- TikTok extraction from Cloudflare egress IPs gets rate-limited. Moved TikTok URL → video extraction to the client (residential IP) and kept the Worker endpoint as a fallback only.
- Pivoted stacks on day one. Started as a Next.js PWA, switched to Expo within the first 24 hours so we could use the native camera + ship through EAS without losing the web-preview path.
- Cost discipline. Perfect Corp's image-to-video endpoint is dramatically more expensive than static try-on. Video is gated behind an explicit "Make it a video" share action, never the default Try OnMe tap, so a viral demo can't blow the budget.
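The diagnose-and-mutate retry loop described in the first challenge above is essentially a fixed sequence of passes, each with a different repair strategy. A sketch under stated assumptions: the strategy names mirror the write-up, but the `tryOn` callback, its result shape, and the pass order are illustrative, not the actual `photoMutator.ts`.

```typescript
// Mutation strategies from the write-up; "original" is the untouched first attempt.
type Strategy = "original" | "recrop" | "brighten" | "swap-photo";

// The 4-pass schedule: each failed pass escalates to the next mutation.
const PASSES: Strategy[] = ["original", "recrop", "brighten", "swap-photo"];

// Runs up to four attempts; returns the strategy that succeeded,
// or null if every pass failed.
function tryOnWithRetry(tryOn: (s: Strategy) => { ok: boolean }): Strategy | null {
  for (const strategy of PASSES) {
    if (tryOn(strategy).ok) return strategy;
  }
  return null;
}
```

In the real pipeline the strategy choice is driven by the `/diagnose` endpoint reading YouCam's error, rather than a fixed order.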
Accomplishments we're proud of
- End-to-end pipeline (scan → detect → shop → try-on) returns in seconds on a phone.
- Self-healing try-on: ~4× the success rate vs. naive single-shot YouCam calls, just from the diagnose-and-mutate retry loop.
- Zero secrets on the device — every paid API is proxied through one Worker with a token + host allowlist.
- Single Expo codebase ships to iOS, Android, and a working web preview that judges can open in any browser.
What we learned
- For LLM-based visual search, the reranker is the product. Lens + Yandex give you raw recall; GPT-4o + a domain blocklist + a category-aware Shopping fallback turns it into something a human will actually click.
- Grounding-DINO produces dramatically tighter crops than GPT vision bboxes — worth the extra model hop.
- A single fat Cloudflare Worker as the security boundary is shockingly nice DX for a mobile app: one deploy, one set of secrets, one place to add allowlists and rate limits.
- Expo Router + Supabase magic-link deep-linking is finally clean enough for production (the redirect URLs are still the part most likely to bite you).
What's next for onMe
- Replace the hardcoded catalog (`lib/feed.ts`) with a Supabase table and a tiny admin UI.
- Wire Stripe to the bag — today it persists but doesn't check out.
- Webhooks instead of polling on YouCam (the dashboard supports it; we left it on polling for the demo).
- Body-part coverage beyond wrist + neck + finger + ear: full-body outfit try-on works, but onboarding for it is rough.
- Cop/Drop social signal feeding back into the reranker so popular picks float to the top.