Inspiration

I wanted to experiment with computer vision and machine learning applications in the mobile world and create an app that uses those technologies to produce something cool and useful at the same time.

What it does

Ever seen a cool shirt and had trouble finding, say, matching shoes? This app solves that exact problem. Take a picture of a piece of clothing and the app will suggest where to find a matching one.

It's based on a computer vision algorithm trained on a custom set of data. This version of the app can already suggest a matching pair of shoes when provided with a picture of a t-shirt. The suggestions are derived from Zalando's product catalog. Extending it further and using larger datasets could allow more precise and more exciting applications!

Note

To see the algorithm in action, open the web app on the HTTPS-enabled site:

https://fabmatch.online

Accept the self-signed SSL certificate. When the app opens, select "Live" and allow the app to access your camera to see the live tracking.

How I built it & challenges

I wanted to write this section WHILE building the application. Hence, this is written in a kind of blog style.

Friday

Around 6pm: I started looking for ways to train a computer vision algorithm to detect a piece of clothing. I have some familiarity with the Constrained Local Model approach to feature detection. That works pretty well if you have the time to commit to it; the process includes annotating a set of sample images to match an outline model of what you are trying to recognise. I figured this would take too much time, and I need to get something done quicker than that so I can determine whether this project is even viable.

7pm I found a JS library that can detect features in an image or video stream using Haar cascades. It works really well, but I would have to train the Haar classifier for different pieces of clothing myself; I couldn't find any readily available classifiers for that purpose on the internet. With OpenCV it's fairly straightforward to train a Haar classifier, so I'm going to try that first. All I need is a large set of positive and negative image samples and I'm all set.

8pm Started exploring where I could get a large enough set of quality images for training. The Zalando website has a few with a white background. I reckon this kind of image is optimal for training a Haar classifier. At a quick glance, the official hackathon API doesn't seem to provide enough suitable images, and I'm definitely not going to download them by hand, so I'm gonna have to find a way to fetch the images automatically. Let the hacking begin!

9pm I put together a small Python script that fetches the page HTML for specific search queries. Using regex I can easily parse the product codes and point to a media URL where I can download the images. Sweet! Moving right along.
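The parsing step can be sketched roughly as follows. This is only an illustration, not the script from the hackathon: the `data-sku` attribute, the product-code format, and the media URL template are all hypothetical stand-ins, since the actual page markup isn't described here.

```python
import re

# Hypothetical pattern: the real product-code format and page markup are
# not given in this write-up, so this regex is only an illustration.
PRODUCT_CODE_RE = re.compile(r'data-sku="([A-Z0-9-]+)"')

# Hypothetical media URL template, not a real endpoint.
MEDIA_URL_TEMPLATE = "https://media.example.com/images/{code}.jpg"


def extract_product_codes(html):
    """Return the unique product codes in the order they first appear."""
    seen = set()
    codes = []
    for code in PRODUCT_CODE_RE.findall(html):
        if code not in seen:
            seen.add(code)
            codes.append(code)
    return codes


def image_urls(html):
    """Build one (hypothetical) image URL per product code found in the HTML."""
    return [MEDIA_URL_TEMPLATE.format(code=c) for c in extract_product_codes(html)]
```

Deduplicating while keeping order matters because the same product tends to show up several times on a search results page, and downloading it twice would only add duplicate training samples.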

10pm The downloaded images were mostly perfect frontal images of t-shirts and trousers with a white background. Some of them also contained the model or some background noise. I have to manually remove the images that are no good for training.

10:30pm Started training the Haar cascade. This is going to take a long time unless I find some serious hardware, physically or in the cloud. The Microsoft guys left already, so I need to think of something. Meanwhile I'm going to focus on designing the UI!

Saturday

1am I need to tweak the parameters for cascade training a bit. The cascade classifier was overtrained with 16 stages. I'm dropping to 8 and letting it run through the night while I sleep. I'm trying to find some Microsoft guys to get me some cloud time to run the training algorithm. Everyone's asleep...
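For reference, here is a minimal sketch of how such a training run could be assembled with OpenCV's `opencv_traincascade` tool. Only the drop from 16 to 8 stages comes from the log above; the file names, sample counts, and window size are assumptions made up for illustration.

```python
def traincascade_command(num_stages, num_pos, num_neg, width=24, height=24):
    """Build an opencv_traincascade argument list.

    The paths below (cascade_out, positives.vec, negatives.txt) are
    hypothetical; they are not the files used in the project.
    """
    return [
        "opencv_traincascade",
        "-data", "cascade_out",      # output directory (assumed name)
        "-vec", "positives.vec",     # packed positive samples (assumed name)
        "-bg", "negatives.txt",      # list of negative images (assumed name)
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-featureType", "HAAR",
        "-w", str(width),
        "-h", str(height),
    ]


# 8 stages as in the log; the sample counts here are placeholders.
cmd = traincascade_command(num_stages=8, num_pos=40, num_neg=200)
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd)` on a machine where OpenCV's training tools are installed.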

2am Working on the application UI with TouchstoneJS, which is a traditional React app boilerplate but with no documentation. Getting strange errors from the router and a bit tired after such a long coding spree. Going to go to sleep soon.

11am Back at it! The cascade was trained with 8 stages while I was sleeping and produced a result. Need to test it with js-objectdetect. If it works well enough this project will become totally viable so I'm getting excited.

1pm The results are not very impressive. I think I need a better dataset. If I could find images with different backgrounds, both with and without t-shirts, it would be perfect. In that case I would need at least 50 positive images and 500 negative images. Hmm... need to explore possibilities.

2pm So I went and took a total of 250 pictures (positive and negative samples) of Junction hackers who volunteered. Every volunteer had a t-shirt on, so this project will probably be best for finding matching trousers for a t-shirt. Now I need to process the images and choose positives and negatives. Downscaling can be done with Python. I need to think about the cropping part.

3pm I created a nice set of scripts for preprocessing the images with ImageMagick. This includes downscaling and cropping sections. I want to be able to train my cascade a bit faster with better accuracy (I'm fairly confident at this point that my dataset is OK enough). I talked to the guys at DataCenter and they were generous enough to give me access to their Jelastic cloud. I set up a container with 16 cloudlets for a total of 6.4 GHz of CPU power. Now I'm building OpenCV on the Ubuntu container, with the flag WITH_TBB=ON to allow multithreading. make seems to correctly utilize the available resources with the -j flag on. Just have to wait for it to finish to test the training performance.

4:30pm Build complete. Let's test!

5pm Training the cascade requires too much RAM, so my process is getting killed by the platform. I'm unable to find the DataCenter guys here to give me some more RAM. I might go with Azure. Meanwhile the local training completed after ~6 hours, and testing the classifier produced a RESULT for the first time! The algorithm can detect t-shirts with moderate success. Short sleeves and open arms with a U-neck work the best. The only thing left to do is to apply some tweaking so we can determine the color of the shirt, which will be our main criterion for finding a matching pair of trousers.

6pm Apparently Atom's image preview interprets image orientation wrong, so the rotate command I added in my preprocessing mixes up the detection a bit (the image needs to be tilted to work properly). I'm going to train a new cascade on Azure and tweak the parameters a bit. Whatever the result, I will have to go with it. There's not enough time.

7pm Wow, setting up Azure was really a breeze. I set up an Ubuntu F16S virtual machine with 16 cores and 32GB memory with the Azure pass subscription. Compiling OpenCV from source took ~3 minutes as opposed to ~30 minutes in the DC Jelastic. Obviously more hardware = more power, so Azure is better suited for this purpose.

7:20pm Omg. The training completed in ~10 minutes as opposed to ~6 hours on a standard 2015 MacBook Pro. Really awesome!! Thanks Microsoft!

9pm Implemented object capture from HTML5 video and tested on an Android phone (Huawei Honor 8). Detection works, but the algorithm requires some tweaking due to a large number of false positives. When the image is moving (or a person wearing a t-shirt moves past the camera) it goes crazy. That's a good sign.

11pm The 12-hour coding spree is starting to take a toll on me. I'm just trying to wrap everything up so I get a demoable app that looks good enough. I'm switching UI frameworks to BlockJS, a very lightweight prototyping framework written by me, to introduce less overhead.

Sunday

1am Things are starting to look okay. I'm happy with how the UI/UX is starting to unfold. I've got most of the components planned and designed and half of them implemented. When I have integrated the Zalando API and bound data to elements I'm gonna go back to optimizing the classifier algorithm.

02:30am Starting to integrate the Zalando API. Much to my dismay, I found out that the API doesn't have a trousers category. I will have to match shirts with shoes instead.

05:30am Still at it. Can now determine the mean color of a t-shirt from an image. Should be enough for demo purposes if I can get a general color classifier done that works with the data returned from the Zalando API. Also I just realized I don't distinguish between men's and women's clothes. Need to think of something quickly!
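The mean-color step is simple enough to sketch. The app itself runs in the browser, so the following Python is only an illustration of the idea, assuming the image is available as rows of (r, g, b) tuples and the detector returns a rectangle as (x, y, width, height); both of those representations are assumptions.

```python
def mean_color(pixels, box):
    """Average the RGB values inside a detection rectangle.

    pixels: list of rows, each row a list of (r, g, b) tuples (assumed layout).
    box: (x, y, width, height) of the detected t-shirt (assumed format).
    """
    x, y, w, h = box
    totals = [0, 0, 0]
    count = 0
    for row in pixels[y:y + h]:
        for r, g, b in row[x:x + w]:
            totals[0] += r
            totals[1] += g
            totals[2] += b
            count += 1
    if count == 0:
        raise ValueError("detection box does not overlap the image")
    # Integer division keeps the result in the usual 0..255 channel range.
    return tuple(t // count for t in totals)
```

Averaging over the whole rectangle also picks up background and skin pixels near the edges; shrinking the box toward its center before averaging would be one cheap way to reduce that.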

7am I had to write an RGB to HSL color converter to determine the basic color from the mean value. The color classifier seems to work okay. Code quality is horrible at this point, but if it works it works! I've been coding for 20 hours straight. Phew. Still going strong...
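A rough sketch of that idea, using Python's standard `colorsys` module instead of a hand-written converter. The thresholds and the set of color names are assumptions chosen for illustration; they are not the values used in the app.

```python
import colorsys


def basic_color(rgb):
    """Map an (r, g, b) tuple with 0..255 channels to a coarse color name."""
    r, g, b = (c / 255.0 for c in rgb)
    # colorsys returns hue, lightness, saturation (HLS order), each in 0..1.
    h, l, s = colorsys.rgb_to_hls(r, g, b)

    # Very dark, very light, or washed-out colors have no meaningful hue.
    if l < 0.15:
        return "black"
    if l > 0.85:
        return "white"
    if s < 0.15:
        return "grey"

    hue = h * 360  # degrees on the color wheel
    if hue < 30 or hue >= 330:
        return "red"
    if hue < 90:
        return "yellow"
    if hue < 150:
        return "green"
    if hue < 210:
        return "cyan"
    if hue < 270:
        return "blue"
    return "magenta"
```

Working in HSL rather than RGB means the hue can be bucketed on its own, so a dark red and a bright red both land in the same basic color.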

8am Omg! It works. The app is now integrated with the APIs and classifiers and it gives exciting results. I would count this as a success! It's not yet time to go to sleep, however. Need to make sure that it works on Android and possibly create an .apk.

10am After some preparation work for the demo, I published the webapp on http://fabmatch.online (Thx Radix for the domain!). I didn't have time to build it into an .apk, but that should be fairly straightforward later on. It works as a Home-Screen app with the standard web view, but introducing a Crosswalk build will offer more power and a smoother experience across all devices. These days hybrid apps perform really well if the correct technologies are chosen.

Accomplishments that I'm proud of

I'm most proud of my custom-trained Haar cascade. I had no real experience with OpenCV before and I managed to come up with something that works.

What I learned

I learned to use Microsoft Azure and Jelastic cloud. Training CV algorithms with custom datasets is no longer a mystery.

I also learned that people will appreciate your cool idea and be ready to help you (by providing images of themselves for CV training) without hesitation. Always use the resources available around you in creative ways.

What's next for Fabmatch

More data = better image classification. I have a feeling that, with the technology I built, this could actually be a viable product. It would probably work pretty well if the CV algorithm were trained with enough data.

The business case for this application would be for partners to increase sales via recommendations to users. If users find real value in this app by finding clothes they like more easily, it would be wise for businesses to be part of the platform. They would provide APIs for their product catalogs so that the app can recommend their products to users.
