Inspiration

Honestly, it started with a frustration most of us share: ordering three sizes of the same shirt just to send two back. Online fashion returns are a quiet disaster, both for shoppers and for the planet, and a static product photo on a model who isn't you only does so much. We kept thinking: phones already have great cameras and decent GPUs, so why are we still guessing?

What it does

AR-Tryon is a webshop where you can point your camera at yourself and see garments wrap onto your body in real time. It tracks 33 body landmarks via MediaPipe, fits 3D GLB models per slot (head, upper, lower, shoes, full-body), and lets you swap colors, snap photos, and get a size recommendation from your measurements. There's also a small admin area for generating new GLBs and viewing demo analytics.
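
To make the slot model concrete, here's a minimal sketch of how the per-slot garment data could be typed. The names (`GarmentSlot`, `Garment`, `Outfit`) are illustrative assumptions, not the project's actual code:

```typescript
// Illustrative types only; the real project's identifiers may differ.
type GarmentSlot = "head" | "upper" | "lower" | "shoes" | "full-body";

interface Garment {
  id: string;
  slot: GarmentSlot;
  glbUrl: string;      // GLB mesh fitted to this body slot
  colors?: string[];   // optional swap palette (e.g. for the t-shirt)
}

// At most one garment per slot; stacking head + upper + lower + shoes
// works because the slots are disjoint.
type Outfit = Partial<Record<GarmentSlot, Garment>>;
```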

How we built it

Frontend is Vite + React + TypeScript with Tailwind and shadcn for the UI, Three.js for rendering garments through an orthographic camera, and MediaPipe Tasks Vision for pose detection. State lives in Zustand. The backend is split intentionally: a lightweight FastAPI public service for analytics and the garment manifest, and a separate, isolated GLB generation service that stays off the public network and only runs when an admin enables it. Everything ships in Docker with an ngrok tunnel so it runs anywhere.
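
As a rough sketch of the detection loop, using the standard @mediapipe/tasks-vision API (the model path and wiring here are illustrative, not our exact setup):

```typescript
import { FilesetResolver, PoseLandmarker } from "@mediapipe/tasks-vision";

// Load the WASM runtime, then a pose model that emits 33 landmarks per person.
const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
);
const landmarker = await PoseLandmarker.createFromOptions(vision, {
  baseOptions: { modelAssetPath: "/models/pose_landmarker_lite.task" }, // hypothetical path
  runningMode: "VIDEO",
  numPoses: 1,
});

function onFrame(video: HTMLVideoElement, nowMs: number) {
  const result = landmarker.detectForVideo(video, nowMs);
  const pose = result.landmarks[0]; // 33 normalized landmarks, if a person is visible
  if (pose) {
    // map landmarks to garment transforms, then render via Three.js
  }
  requestAnimationFrame((t) => onFrame(video, t));
}
```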

Challenges we ran into

Pose-to-garment alignment was harder than it looked. The mirrored selfie view introduced a sneaky double-negation in our rotation math, so shirts tilted the wrong way for a while. Pants kept sitting too high on the torso until we lifted the waistband above the hip landmarks. Skewing a flat-feeling GLB to follow shoulder-hip lean without it looking rubbery took a lot of tuning. And keeping ~30fps on mobile while doing pose inference, smoothing, and WebGL rendering meant cutting corners we didn't want to cut.
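
For flavor, here's roughly what the mirror-aware roll computation looks like once untangled. The landmark indices are MediaPipe's standard ones; the smoothing constant is an assumed tuning value, not our measured one:

```typescript
import type { NormalizedLandmark } from "@mediapipe/tasks-vision";

// Shoulder roll in raw image space. Landmarks 11/12 are the person's
// anatomical left/right shoulders; facing the camera, the left shoulder
// lands on the image's right, so the right-to-left vector points +x.
function shoulderRoll(pose: NormalizedLandmark[]): number {
  const l = pose[11], r = pose[12];
  return Math.atan2(l.y - r.y, l.x - r.x);
}

// A mirrored selfie view flips the sign of the roll. Apply the flip exactly
// once (here, and nowhere else in the render path); negating a second time
// inside the Three.js rotation was our double-negation bug.
const displayRoll = (raw: number, mirrored: boolean) => (mirrored ? -raw : raw);

// Exponential smoothing keeps garments from jittering at ~30fps.
const smoothRoll = (prev: number, next: number, alpha = 0.3) =>
  prev + alpha * (next - prev);
```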

Accomplishments that we're proud of

The drop-a-GLB-and-it-just-shows-up pipeline: prefix the file with UPPER_, LOWER_, etc., and the manifest regenerates automatically. The fact that you can stack head + upper + lower + shoes and they all track together. Color swapping on the t-shirt feels genuinely fun. And the architecture decision to isolate the GPU-heavy generation service from the public storefront means scaling one doesn't drag the other down.
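
A sketch of the prefix-to-slot scan behind that pipeline. The real service is the FastAPI backend, so treat this TypeScript version (and any prefix spelling beyond UPPER_/LOWER_) as an assumption for illustration:

```typescript
import { readdirSync } from "node:fs";
import { basename } from "node:path";

// Prefix spellings other than UPPER_/LOWER_ are guesses for illustration.
const PREFIX_TO_SLOT: Record<string, string> = {
  HEAD_: "head",
  UPPER_: "upper",
  LOWER_: "lower",
  SHOES_: "shoes",
  FULL_: "full-body",
};

function buildManifest(dir: string) {
  return readdirSync(dir)
    .filter((file) => file.endsWith(".glb"))
    .flatMap((file) => {
      const prefix = Object.keys(PREFIX_TO_SLOT).find((p) => file.startsWith(p));
      if (!prefix) return []; // unknown prefix: skip rather than guess a slot
      return [{
        id: basename(file, ".glb"),
        slot: PREFIX_TO_SLOT[prefix],
        glbUrl: `/models/${file}`,
      }];
    });
}
```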

What we learned

AR fit is 20% math and 80% taste: the numbers can be "correct" and it still looks off. Mirroring is a trap; pick a coordinate convention early and write it on the wall. Also: shipping a working demo beats shipping a perfect one, and isolating risky services (looking at you, TRELLIS) early saves a lot of pain later.

What's next for AR Clothing Try-On

Real cloth simulation instead of skewed meshes, so fabric actually drapes. Better depth estimation, probably leaning on WebXR where it's available. Persistent measurements tied to a user account so size recommendations get smarter over time. Wiring up real TRELLIS generation on a GPU host so brands can upload a product photo and get a wearable GLB back. And eventually, a proper checkout so the loop from "try" to "buy" closes inside the same session.

Built With

docker, fastapi, mediapipe, ngrok, python, react, shadcn, tailwind, three.js, typescript, vite, zustand
