Diversify

black hair filter (w/mustache!)

Inspiration

Neural networks are a powerful tool in computer vision and can learn to map one image to another. However, they fit their training data so closely that a new problem is emerging: models are sensitive to heterogeneity in image intensity. One example is computer vision systems that fail on images of people of color, women, or people with non-traditional hair colors.

The root of these failures lies in the dataset, because a model learns only what it is given. When a machine learning dataset lacks diversity, the resulting neural network makes decisions based only on that narrow data. If a neural network trains on a dataset in which all minorities are labeled as criminals, for example, it will learn exactly that mapping. This is obviously undesirable and has serious ethical consequences.

StarGAN is a generative adversarial network, built from convolutional layers, which performs image-to-image translation from one set of categories to another within a single model. Its ability to change a person's face, hair, age, and skin color can be a first step towards improving diversity in existing datasets. In tandem, we hope that data houses will strive to collect data which is more representative of the diversity we have on this planet.

What it does

To demonstrate our first steps toward improving diversity in human face datasets, we apply StarGAN to a live webcam feed and encourage users to toggle their visual appearance, including hair color, gender, and age. We hope to use the wisdom of the crowd to help identify where and how our model fails to apply style transfer accurately. In this way, we can work toward a solution where StarGAN transforms a dataset limited in diversity into one which supports good performance in a variety of scenarios.
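As a rough illustration of what "toggling" means for the model, StarGAN is conditioned on a vector of target attribute labels. The sketch below shows one way a toggle could be turned into such a vector. The attribute names and their order (`ATTRS`) and the helper `toggle_attribute` are assumptions made for this example, not code taken from our server.

```python
# Assumed attribute order for illustration only; the real model's
# attribute list depends on how it was trained.
ATTRS = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Male", "Young"]
HAIR = {"Black_Hair", "Blond_Hair", "Brown_Hair"}


def toggle_attribute(current, attr):
    """Return a new 0/1 label list with `attr` toggled.

    Hair colors are treated as mutually exclusive: selecting one clears
    the others. Every other attribute is simply flipped.
    """
    if attr not in ATTRS:
        raise ValueError(f"unknown attribute: {attr}")
    target = list(current)  # copy so the caller's labels are untouched
    idx = ATTRS.index(attr)
    if attr in HAIR:
        for j, name in enumerate(ATTRS):
            if name in HAIR:
                target[j] = 0
        target[idx] = 1
    else:
        target[idx] = 1 - target[idx]
    return target
```

For example, starting from a young man with black hair, `toggle_attribute([1, 0, 0, 1, 1], "Blond_Hair")` selects blond hair and leaves gender and age unchanged. The resulting list would then be converted to a tensor and passed to the generator alongside the face image.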

How we built it

Frontend: HTML, JavaScript, Bootstrap
Backend: Python server running StarGAN on a GPU. The pipeline locates the face with a Haar cascade, performs rough edge segmentation with a blurred Canny operator, and applies the StarGAN transformation using the PyTorch framework.

Challenges I ran into

Converting academic code to functions usable in real-time
Maintaining compatibility across multiple operating systems
Dealing with browser limitations on webcam streaming
Use of computer vision filters for traditional edge detection as preprocessing
Dealing with noise artifacts in webcam capture

Accomplishments that I'm proud of

Combining segmentation and adversarial transformation

What I learned

How to use PyTorch!
That OpenCV has great built-in filters like the Canny operator
Webcams are noisy

What's next for Style Transfer Fashion Model

Improved preprocessing so that the StarGAN output is crisper
Continued training to enhance performance
Research into efficacy of diversifying datasets
Allow users to submit screenshots and written text to identify areas of improvement in this method

References

Choi, Yunjey, et al. "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

Source of initial StarGAN implementation: https://github.com/yunjey/stargan/