🪞 Try On AI
Inspiration
The global fashion e-commerce market exceeds $820 billion, yet it's plagued by a fundamental, unsolved problem: shoppers can't try clothes on before they buy. The result is a staggering 20% apparel return rate that costs U.S. retailers over $150 billion annually in reverse logistics, restocking, and lost margin. For shoppers, it's an equally frustrating cycle of guessing wrong and sending things back.
The spark for Try On AI came from two real-world perspectives on our team: the burned online shopper and the retail floor insider.
As customers, we were tired of the guesswork — static 2D product photos, inconsistent sizing, and a hope-for-the-best checkout button. Jonas brought the other half of the picture. Having worked on the sales floor at Old Navy, he saw firsthand how fitting rooms bottleneck brick-and-mortar retail: long queues, stretched staff, and a checkout experience that hasn't meaningfully changed in decades.
Existing solutions like Amazon's style features and Snap's AR try-on are shallow overlays that put the burden on the user to pose around the garment to avoid clipping, rather than the other way around. They fall far short of replicating a real fitting room experience. We saw an opportunity to build something genuinely different — a photorealistic, generative AI-powered virtual fitting room that works for both the online shopper and the in-store customer.
What It Does
Try On AI is an interactive smart mirror that lets you see yourself wearing any outfit — before you buy it.
Open the app, allow webcam access (or upload a selfie), and browse the clothing catalog. Select a garment, and within seconds Google Gemini edits a photo of you to show exactly how that item looks on your body. No guessing. No returns.
On top of the try-on, a backend AI stylist agent evaluates your selection and automatically recommends matching pieces — pants, jackets, footwear — pulled from live retail catalogs. A voice agent lets you browse hands-free, and a vision agent tracks what you're already wearing to make smarter, personalized recommendations.
The result is a single platform that serves two distinct customers: the online shopper who wants purchase confidence, and the retail operator who wants to cut return rates and modernize the in-store experience.
How We Built It
The project is a monorepo with several decoupled layers:
- try-on-react — React + Vite frontend with a glassmorphic, sci-fi HUD aesthetic. Handles webcam capture, catalog browsing, and try-on result display.
- my-agent — FastAPI backend wrapping Google Gemini's image editing API via a NanoBananaProcessor pipeline. Accepts a user photo + garment image and returns a photorealistic composite.
- style-finder — AI stylist agent that reads the user's current selection and queries live merchant catalogs to surface coordinating recommendations.
- vision-agent — Computer vision layer that analyzes what the user is already wearing to personalize suggestions.
- GetStream Video SDK — Handles WebRTC streaming and custom event passing between the browser and Python agent in full integration mode.
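As a rough illustration of the my-agent flow described above (user photo plus garment image in, composite out), here is a minimal sketch. The names `TryOnRequest`, `ImageEditor`, `build_prompt`, and `run_try_on` are hypothetical, and the prompt wording is an assumption; the real NanoBananaProcessor interface is not shown here, so the Gemini call is abstracted behind a plain callable.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (user_photo, garment_image, prompt) -> composite bytes.
# In the real service this role would be played by the Gemini-backed pipeline.
ImageEditor = Callable[[bytes, bytes, str], bytes]


@dataclass(frozen=True)
class TryOnRequest:
    """Inputs for one try-on: the shopper's photo and the selected garment."""
    user_photo: bytes
    garment_image: bytes
    garment_name: str


def build_prompt(garment_name: str) -> str:
    # Assumed prompt wording, for illustration only.
    return (
        f"Edit the first image so the person is wearing the {garment_name} "
        "shown in the second image. Keep the face, pose, and background unchanged."
    )


def run_try_on(request: TryOnRequest, editor: ImageEditor) -> bytes:
    """Validate the inputs, then delegate the image edit to the injected editor."""
    if not request.user_photo:
        raise ValueError("user photo is empty")
    if not request.garment_image:
        raise ValueError("garment image is empty")
    return editor(
        request.user_photo,
        request.garment_image,
        build_prompt(request.garment_name),
    )


def fake_editor(user_photo: bytes, garment_image: bytes, prompt: str) -> bytes:
    """Stand-in editor for local testing; it performs no real image editing."""
    return b"composite:" + user_photo + b"+" + garment_image
```

Keeping the editor injectable means the frontend and the request handling can be exercised without spending API calls, which matters when each real edit takes many seconds.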
Challenges We Ran Into
Gemini image editing latency. A high-quality try-on takes 15–30 seconds to generate. We designed the UX around that wait: a sci-fi "scanning" HUD animation and pop-ups that tell users exactly what is happening keep them engaged and make the processing feel intentional rather than broken.
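One way to drive status pop-ups during a long-running generation is to poll the job on a fixed interval and emit a message each time. The sketch below assumes an asyncio backend; `run_with_status`, the message texts, and the 5-second default interval are all hypothetical and not taken from the project.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical status texts; the real HUD copy is not documented here.
STATUS_MESSAGES = (
    "Scanning your photo...",
    "Fitting the garment...",
    "Rendering the result...",
)


async def run_with_status(
    job: Awaitable[T],
    notify: Callable[[str], None],
    interval: float = 5.0,
    messages: Sequence[str] = STATUS_MESSAGES,
) -> T:
    """Await `job`, calling `notify` once at the start and again every `interval` seconds."""
    task = asyncio.ensure_future(job)
    index = 0
    while True:
        # Once the list is exhausted, keep repeating the last message.
        notify(messages[min(index, len(messages) - 1)])
        index += 1
        done, _pending = await asyncio.wait({task}, timeout=interval)
        if done:
            return task.result()
```

In a real deployment `notify` would push the message to the browser (for example as a custom event); here it is just a callable so the loop can be tested on its own.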
Multi-agent coordination. Wiring together the try-on pipeline, stylist agent, voice agent, and vision agent so they share context without stepping on each other required significant architecture work across the monorepo.
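A common pattern for this kind of coordination is a shared session store in which each agent writes only under its own name and readers receive a copy. The sketch below shows that pattern under stated assumptions; `SessionContext` and the agent and field names are hypothetical and do not describe the project's actual architecture.

```python
import asyncio
import copy
from typing import Any, Dict


class SessionContext:
    """Shared per-session state, namespaced by agent so writers do not collide."""

    def __init__(self) -> None:
        self._state: Dict[str, Dict[str, Any]] = {}
        self._lock = asyncio.Lock()

    async def publish(self, agent: str, **fields: Any) -> None:
        """Merge `fields` into the namespace owned by `agent`."""
        async with self._lock:
            self._state.setdefault(agent, {}).update(fields)

    async def snapshot(self) -> Dict[str, Dict[str, Any]]:
        """Return a deep copy so readers cannot mutate the shared state."""
        async with self._lock:
            return copy.deepcopy(self._state)


async def demo() -> Dict[str, Dict[str, Any]]:
    ctx = SessionContext()
    # Two agents publish concurrently, each into its own namespace.
    await asyncio.gather(
        ctx.publish("vision", wearing=["blue jeans"]),
        ctx.publish("try_on", selected="denim jacket"),
    )
    return await ctx.snapshot()
```

A stylist agent could then read `snapshot()` to see both the current selection and what the shopper is already wearing without touching either agent's data.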
Balancing ambition with cost. Our initial plan was to use GetStream's Decart Lucy model for real-time video try-on, but the per-frame API costs scaled faster than we could manage. We fell back to a Gemini-powered photo booth approach instead, which turned out to produce more photorealistic results and a far more sustainable cost structure for the actual product.
GitHub organization. As the project grew, the monorepo became messy and inefficient. At one point we had to run three separate commands, each from a different part of the project, just to get the app running; Misa wrote a script that launches all of them simultaneously. Even so, that was time we could have spent building features. The layout also made routine Git operations (merging, pulling, pushing, and so on) harder to get right, which cost us further time on project organization.
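A launcher of the kind described above can be quite small. The sketch below is not Misa's actual script; `start_all`, `wait_all`, and the entries in `EXAMPLE_SERVICES` (commands and directories alike) are hypothetical placeholders for whatever the three real commands are.

```python
import subprocess
from typing import List, Optional, Sequence, Tuple

# (command, working directory); None means the current directory.
Service = Tuple[Sequence[str], Optional[str]]

# Hypothetical entries for illustration only; substitute the real commands.
EXAMPLE_SERVICES: List[Service] = [
    (["npm", "run", "dev"], "try-on-react"),
    (["python", "main.py"], "my-agent"),
    (["python", "main.py"], "style-finder"),
]


def start_all(services: Sequence[Service]) -> List[subprocess.Popen]:
    """Start every service without waiting, so they all run side by side."""
    return [subprocess.Popen(list(cmd), cwd=cwd) for cmd, cwd in services]


def wait_all(procs: Sequence[subprocess.Popen]) -> List[int]:
    """Block until every process exits; on Ctrl+C, terminate them all first."""
    try:
        return [proc.wait() for proc in procs]
    except KeyboardInterrupt:
        for proc in procs:
            proc.terminate()
        return [proc.wait() for proc in procs]
```

Calling `wait_all(start_all(EXAMPLE_SERVICES))` from the repository root would start all three services and return their exit codes once they stop.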
Accomplishments We're Proud Of
- Gemini-powered try-ons that are genuinely photorealistic, not rough overlays
- A production-style monorepo with five cleanly separated service layers that mirrors real enterprise architecture
- A premium UI polished enough to demo to a retail buyer today
- A working voice + vision agent stack that proves the concept is deployable in a physical store environment
What We Learned
Generative image models are powerful, but their output quality is almost entirely a function of input quality. Building reliable prompt engineering and image capture pipelines around Gemini taught us as much about product design as it did about AI.
We also learned that designing for two users simultaneously — the shopper and the store operator — forces better product decisions than designing for one. The features that matter most are the ones that move a business metric (return rate, conversion, dwell time), not just the ones that feel impressive in a demo.
Business Model
Try On AI targets mid-market apparel retailers ($50M–$500M revenue) — large enough to feel the pain of high return rates, but without the resources to build custom AR infrastructure.
We see three monetization paths:
- SaaS licensing — white-labeled smart mirror software sold to retailers on a monthly per-location basis
- E-commerce API — per-try-on pricing for online retailers who want to embed the feature directly in their product pages
- Stylist agent affiliate revenue — commission on purchases driven by the recommendation engine
In under 24 hours, we built a working prototype that demonstrates the core value proposition and proves the technical feasibility of the full product. The next step is a pilot with a regional retailer to generate real return-rate data.
What's Next for Try On AI
- Real-time video try-on — feeding the live video pipeline into a frame-by-frame generative model so the outfit moves with you in real time
- Retail kiosk v1 — a hardened, touchscreen + voice-controlled deployment package for physical stores, targeting a paid pilot with a regional apparel chain
- Merchant catalog integrations — direct sync with Shopify and major retail APIs so the stylist agent recommends in-stock, purchasable items and drives measurable conversion
The core insight is simple: every percentage point reduction in return rate goes straight to the bottom line. Try On AI makes that possible without requiring retailers to rebuild their infrastructure.
Built With
- adal
- claude
- css
- getstream
- javascript
- python
- react
- vite
- voiceos