How many times have you gotten excited about an online order, only to be disappointed once you've tried it on? After that comes a lengthy returns process that can take ages. In recent months there have been large-scale changes in how retailers sell and distribute their products. Countless companies have taken their business online, and as a result consumers have been flooded with new opportunities to purchase products via web marketplaces. While this spike in accessibility may seem convenient, it is often challenging to envision how an article of clothing will suit you. These factors inspired our team to present MyWardrobe.
What it does
MyWardrobe is a website that lets you envision how you would look in an article of clothing. You first select the clothing you might like to purchase; after choosing those items, you upload a picture of yourself and the website shows you how you would look in those clothes!
How we built it
MyWardrobe was built using pretrained neural networks in PyTorch and TensorFlow. The first network performs semantic segmentation on a user-provided image, disentangling the clothing from the person. We fine-tuned this network for single-person images, since it was originally trained on a dataset containing multiple people, some possibly facing away from the camera. The second network estimates the user's pose from the same image. Finally, a third network takes in all of this information and generates a deformation field that maps and blends the user and the clothing to produce a stylized version of the user.
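The three-stage pipeline above can be sketched as follows. This is a minimal illustration with stand-in functions in place of the real networks; the function names, the binary mask, and the alpha-blend step are all assumptions made for the sketch, not the actual trained models.

```python
import numpy as np

def segment_person(image):
    """Stand-in for the fine-tuned segmentation network:
    returns a binary mask disentangling the person from the background."""
    return (image.mean(axis=-1) > 0.1).astype(np.float32)

def estimate_pose(image):
    """Stand-in for the pose-estimation network:
    returns a few (x, y) keypoints (e.g. head, torso)."""
    h, w = image.shape[:2]
    return np.array([[w / 2, h / 4], [w / 2, h / 2]])

def warp_and_blend(user_image, clothing_image, mask, keypoints):
    """Stand-in for the generator: the real model predicts a deformation
    field; here we just alpha-blend the clothing onto the masked region."""
    alpha = mask[..., None]
    return alpha * clothing_image + (1.0 - alpha) * user_image

def try_on(user_image, clothing_image):
    """Run the full segmentation -> pose -> warp/blend pipeline."""
    mask = segment_person(user_image)
    keypoints = estimate_pose(user_image)
    return warp_and_blend(user_image, clothing_image, mask, keypoints)
```

In the real pipeline each stand-in is a separate pretrained model, and the outputs of the first two stages are passed together into the generator.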
The front-end is developed in JavaScript with React.js and Redux, which let us dynamically change components as new products are added to the catalogue, store state between pages, and prototype quickly. This allowed us to keep a cart of items between the homepage and the style-transfer page, and to display new style-transfer photos without reloading the page.
The back-end is developed in Python with Flask, which is simple and fast for prototyping. Since the inference models are also written in Python, we can call them directly from the back-end. We developed different routing paths for the different functions of the application: saving pictures, querying for pictures, and requesting a style transfer are all performed on the server side.
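A minimal sketch of what those routing paths might look like in Flask. The route names, the in-memory photo store, and the echo-style transfer endpoint are all hypothetical stand-ins; the real application persists images and invokes the inference pipeline.

```python
import io
from flask import Flask, request, jsonify

app = Flask(__name__)
PHOTOS = {}  # hypothetical in-memory store; the real app saves pictures properly

@app.route("/photos", methods=["POST"])
def save_photo():
    # Save an uploaded user picture under its filename.
    file = request.files["photo"]
    PHOTOS[file.filename] = file.read()
    return jsonify({"saved": file.filename}), 201

@app.route("/photos/<name>", methods=["GET"])
def get_photo(name):
    # Query for a previously saved picture.
    if name not in PHOTOS:
        return jsonify({"error": "not found"}), 404
    return PHOTOS[name], 200, {"Content-Type": "application/octet-stream"}

@app.route("/transfer", methods=["POST"])
def style_transfer():
    # In the real app this would run the segmentation/pose/generation
    # pipeline on the saved photo; here it just echoes the item ids.
    items = request.get_json().get("items", [])
    return jsonify({"status": "ok", "items": items})
```

Keeping all three functions behind separate routes lets the React front-end update the style-transfer photo asynchronously without a page reload.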
Challenges we ran into
Initially, the team attempted to synthesize style using a GAN that leveraged a segmentation map to disentangle the target space, with the goal of generating style for articles of clothing beyond tops. After digging through endless uncommented config files and forced selections, we found that the GAN was ill-specified: the section we eventually uncovered invalidated the part of the paper we needed for our task, and we had to restart after a hard day of work.
Accomplishments that we're proud of
Our team is incredibly proud of all our accomplishments, from front-end to back-end. On the back-end, the processing pipeline integrated three models across different frameworks and platforms that had to be patched together seamlessly to get reasonable results. Despite not having time to leverage the official OpenPose, we managed to optimize a lighter-weight pose-estimation model that still came through!
What we learned
The team had a lot of new experiences today! On the application side, it was our first run at both a web front-end using React and a back-end using Flask. On the deep-learning side, the new experience was the in-depth integration of many tools, all with complex interfaces!
P.S. Never trust a GAN :'(
What's next for MyWardrobe
MyWardrobe works well, but there were a few items we weren't able to finish in time. To streamline the experience, we would like to add back-end GPU support for our networks, which would reduce the user's waiting time. We also noted that v-necks were not cropped at the v, because the material in the back forced the model to produce an overlay rather than a full blend. This could be mitigated by sampling skin from the neck region, enforcing a similar input style, or getting better crops in these regions. One final feature we didn't add on the front-end was the ability to take pictures with a computer's webcam; taking pictures on the spot would further streamline the process.
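The first mitigation mentioned above, sampling skin from the neck region, could look something like this. The function name and the use of a median colour fill are assumptions for the sketch; a real fix would likely use a smarter inpainting method.

```python
import numpy as np

def fill_with_skin(image, gap_mask, skin_mask):
    """Hypothetical v-neck fix: fill the un-cropped gap with the
    median colour sampled from a nearby neck-skin region."""
    skin_pixels = image[skin_mask]            # (N, 3) sampled skin colours
    skin_colour = np.median(skin_pixels, axis=0)
    out = image.copy()
    out[gap_mask] = skin_colour               # paint the gap with that colour
    return out
```

Even this crude fill would turn the visible back-material overlay into something closer to a full blend in the v region.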