The term "computer-generated art" is becoming increasingly common. In video games, for example, procedural generation is used to create levels, NPCs, and, in the case of No Man's Sky, entire games. Machine learning, another powerful set of techniques, is also entering the fray with a newer type of network called the "Generative Adversarial Network" (GAN). Our goal was to use these high-powered techniques to do something fun: create multitudes of coloring book pages for our siblings.
## What it does
When you press a button, our site uses the neural network we trained to generate a coloring book image, and picks an image from our data set otherwise. You can also press a button to download a set of coloring book pages as a PDF. Lastly, you can press a button to have our Twilio bot send you a coloring book image.
## How we built it
First, we needed to gather and normalize a data set for the network to train on. We originally scraped Google Images for training data, but the results were unpredictable and often outside the network's scope. Instead, we chose another site, coloring-book.info, which had the data we needed but was less easily accessible. After collecting the data, we transformed each image into a binary array representing black and white pixels to limit the memory consumption of the training task. From there, we worked in parallel to train the network on the data set and to prepare the website, which would act as a wrapper for the final generator. The GAN implementation was adapted from an existing architecture to better suit our data and our relatively low-powered machines.
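The binarization step described above can be sketched roughly as follows. This is a minimal illustration, not our actual preprocessing code: it assumes each image has already been loaded as rows of 0-255 grayscale values, and the names `binarize` and `pack_bits`, the cutoff of 128, and the bit-packing step are all hypothetical choices made for the example.

```python
from typing import List

Grid = List[List[int]]  # rows of pixel values


def binarize(gray: Grid, threshold: int = 128) -> Grid:
    """Map 0-255 grayscale values to 1 (black line) or 0 (white paper).

    The cutoff of 128 is an assumed default, not a tuned value.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def pack_bits(binary: Grid) -> bytes:
    """Pack pixels row-major, 8 per byte, most significant bit first.

    The final byte is zero-padded on the right if the pixel count
    is not a multiple of 8.
    """
    flat = [px for row in binary for px in row]
    out = bytearray()
    for i in range(0, len(flat), 8):
        chunk = flat[i:i + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        byte <<= 8 - len(chunk)  # pad a short final chunk
        out.append(byte)
    return bytes(out)


# A tiny 3x4 "image": dark values become 1, light values become 0.
sample = [
    [255, 255, 0, 255],
    [255, 0, 0, 255],
    [12, 200, 130, 90],
]
binary = binarize(sample)
packed = pack_bits(binary)
print(binary)
print(len(packed), "bytes for", sum(len(r) for r in binary), "pixels")
```

Storing one bit per pixel instead of one byte per grayscale value is one way the memory cost of a large training set could be reduced; whether the savings matter depends on how the data is fed to the network.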
## Challenges we ran into
There were numerous challenges, most of which revolved around training our GAN efficiently:
- There is no ready-made set of coloring book data to pull from, so we had to find and set up one ourselves.
- As we only had our laptops to train the neural network, we could not use a higher-powered system and simply had to hope that it would produce meaningful output at some point.
- GANs tend to have high variability until they stabilize at a particular loss rate for the generator and discriminator halves, which can take thousands of training sessions. However, if we were to increase the learning rate to compensate, we would get stuck at a high loss point for both halves of the network.
## Accomplishments that we're proud of
- Getting past these computational limitations by elegantly tuning our training protocol
- Being able to re-format our data on the fly as we learned more about what types of data worked well in this system
- Managing to write an API that could be used by others to get our GAN's output.
## What we learned
- PyTorch and its many modules
- Python Imaging Library (PIL)
- Training GANs
## What's next for RoboSketch
- We took many shortcuts to speed up training at the cost of the network's final quality. With more time to comb through more data, the two could be better balanced.
- Instead of initializing each network with random values, each network could first be trained separately on labelled data, after which we would transition to unsupervised learning using the existing loss functions.
- Give users the option to guess between RoboSketch drawings and original data set pictures.
- Add post-processing to the output of the generator, as it will likely contain some undesirable noise.
- Use a threshold to convert the grayscale output to black and white as a post-processing step after training.
- We could provide our data set, which is unique in its purpose, to others to stimulate further advancements in unsupervised machine learning.
- We could modify our training function to include other factors, such as neighbor count.
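As a rough sketch of the post-processing ideas in the list above, the following thresholds a grayscale generator output to black and white and then removes isolated specks using a neighbor count. None of this is implemented in RoboSketch: the function names, the 0.5 cutoff, and the convention that values lie in [0, 1] with higher meaning more ink are assumptions made for illustration. The neighbor count is used here only for noise removal; using it inside the training function would be a separate change.

```python
from typing import List

Gray = List[List[float]]  # assumed generator output, 0.0 = white, 1.0 = black
Mask = List[List[int]]    # 1 = black pixel, 0 = white pixel


def to_black_and_white(gray: Gray, cutoff: float = 0.5) -> Mask:
    """Threshold grayscale values; the cutoff is an assumed default."""
    return [[1 if v >= cutoff else 0 for v in row] for row in gray]


def neighbor_count(mask: Mask, r: int, c: int) -> int:
    """Count black pixels among the up-to-8 neighbors of (r, c)."""
    rows, cols = len(mask), len(mask[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the pixel itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                total += mask[rr][cc]
    return total


def despeckle(mask: Mask, min_neighbors: int = 1) -> Mask:
    """Clear black pixels that have fewer than min_neighbors black neighbors."""
    return [
        [1 if px and neighbor_count(mask, r, c) >= min_neighbors else 0
         for c, px in enumerate(row)]
        for r, row in enumerate(mask)
    ]


raw = [
    [0.9, 0.8, 0.1, 0.0],
    [0.2, 0.7, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.6],  # the 0.6 in the corner is an isolated speck
]
mask = to_black_and_white(raw)
clean = despeckle(mask)
print(mask)
print(clean)
```

Counts are taken from the thresholded mask rather than the partially cleaned result, so the outcome does not depend on the order in which pixels are visited.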