We saw what GANs (generative adversarial networks) were capable of, and we wanted to push the technology to its breaking point with an insanely difficult dataset.
What it does
It tries to learn to generate concept images from a project's sales pitch. A difficult task, we know. In fact, it is far too difficult without a great deal of general world knowledge, which we had no way to build into the model. It tries its best, and it offers some insight into the difficulties of general intelligence and visual processing.
How we built it
We used the text-to-image architecture from the 2016 paper (Generative Adversarial Text to Image Synthesis, Reed et al.), starting from an implementation by GitHub user "zsdonghao." We forked the code and made our own version, optimized for our data and capable of taking live user input for quick testing.
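The core idea in Reed et al.'s architecture is that the generator is conditioned on the text: a sentence embedding is compressed and concatenated with the noise vector before image synthesis. Here is a minimal, shape-level sketch of that conditioning step. This is not the zsdonghao implementation; the dimensions, the fixed random projection standing in for a learned layer, and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

NOISE_DIM = 100      # z-vector size (typical for DCGAN-style models)
TEXT_DIM = 1024      # raw sentence-embedding size (assumed)
REDUCED_DIM = 128    # embedding size after compression (assumed)

# Stand-in "embedding compressor": in the real model this is a learned
# fully connected layer; here it is a fixed random projection.
W_reduce = rng.normal(scale=0.02, size=(TEXT_DIM, REDUCED_DIM))

def generator_input(text_embedding, noise):
    """Build the text-conditioned latent vector fed to the generator."""
    reduced = np.tanh(text_embedding @ W_reduce)   # compress text features
    return np.concatenate([noise, reduced])        # condition on the pitch

text_emb = rng.normal(size=TEXT_DIM)   # e.g. an embedded sales pitch
z = rng.normal(size=NOISE_DIM)
latent = generator_input(text_emb, z)
print(latent.shape)   # (228,) = NOISE_DIM + REDUCED_DIM
```

Because the text embedding is part of the latent vector, two different pitches with the same noise vector produce different images, which is what makes the model "text to image" rather than an unconditional GAN.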
Challenges we ran into
The task we gave the model was pretty much impossible, so it mostly produces visual noise with little relevance to the input text or the training images.
Accomplishments that we're proud of
Getting a working GAN running both on a local machine and on Google Cloud GPU instances, and generating some really cool images along the way.
What we learned
The importance of good data and the limits of machine learning. We also learned that seemingly trivial steps, like installing GPU drivers, can eat up over a quarter of your development time.
What's next for Text to Image: Kickstarter Concept Creator
We hope to refine the dataset into something more tractable and get more useful output. The author of the GitHub implementation mentioned, but did not implement, using a stacked GAN to increase resolution, which would allow full-size images to be generated instead of tiny thumbnails.
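The stacked-GAN idea works in two stages: a first generator produces a low-resolution thumbnail from the text, and a second generator takes that thumbnail plus the same text embedding and refines it to a larger image. A shape-level sketch of that data flow, where the stage functions are stand-ins (a real stage 2 would learn detail rather than just upsample) and all dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def stage1_generate(noise, text_emb):
    """Stand-in for a low-resolution (64x64) conditional generator."""
    return rng.random((64, 64, 3))

def stage2_refine(small_img, text_emb):
    """Stand-in for the refinement stage: 64x64 -> 256x256.
    A real stage-2 GAN would add learned detail conditioned on the
    text; here we nearest-neighbour upsample to show the data flow."""
    return small_img.repeat(4, axis=0).repeat(4, axis=1)

text_emb = rng.normal(size=128)          # assumed sentence embedding
thumb = stage1_generate(rng.normal(size=100), text_emb)
full = stage2_refine(thumb, text_emb)
print(thumb.shape, full.shape)   # (64, 64, 3) (256, 256, 3)
```

The appeal for our project is that the hard text-to-image problem stays at thumbnail scale, while the second stage only has to solve the easier image-to-image upscaling problem.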