While deciding what to build for our first hackathon, we spent the first hour or two spitballing ideas that would be fun or that we felt confident we could make. We discussed an app game involving angling satellites, games involving facial recognition, and websites that would solve some everyday problem. Then someone put forward the idea of a compliment generator, which was later picked up and turned into a WPI Affirmations generator. WPI Affirmations is an Instagram page to which students submit affirmations or hopes for the upcoming school days. The combination of the funny affirmations and the distinct style the creator uses for the final image makes the page very popular. Every member of our team was already following the account and regularly shared affirmations we found funny, so when the idea arose of paying homage to this unique page by having an AI emulate it, we immediately jumped on it and got to work. Initially, we intended to build a website where you would input a phrase and get back an affirmation generated from it, but we abandoned parts of that idea for various reasons.
TensorFlow Analysis
A team member had experience with TensorFlow and therefore knew that text generation was a topic with a lot of support online, some of it coming directly from a tutorial created by the TensorFlow developers. Nonetheless, TensorFlow is a powerful but difficult tool, and getting it to work required a lot of data gathering and formatting. Moreover, it needed a lot of data! The program analyzed chunks of affirmations 300 times each, for a cumulative 200,000 character predictions, each of which was evaluated as correct or incorrect and used to train the recurrent neural network. Typically, a recurrent neural network requires much more data than we had, and training so heavily on such a small dataset led to some repetition in the output. The affirmations bot seems to be very upset by the multi-factor authentication system and obsessed with trying not to steal traffic cones because it trained on those affirmations so often. Overall, the TensorFlow work went relatively smoothly but was time consuming, because the neural network requires a large amount of random access memory to train.
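As a rough illustration of the data formatting described above, character-level text generation usually turns the corpus into pairs of an input chunk and the same chunk shifted one character ahead, so that each position becomes one next-character prediction. The sketch below is plain Python rather than our actual TensorFlow code; the function name `make_training_pairs` and the chunk length are assumptions made for the example.

```python
def make_training_pairs(text, seq_length):
    """Split text into (input, target) id sequences for next-character prediction.

    Hypothetical helper for illustration: each target is its input shifted
    one character to the right.
    """
    # Map every distinct character to an integer id.
    vocab = sorted(set(text))
    char_to_id = {ch: i for i, ch in enumerate(vocab)}
    ids = [char_to_id[ch] for ch in text]

    pairs = []
    # Step through the text in non-overlapping chunks of seq_length + 1 ids.
    for start in range(0, len(ids) - seq_length, seq_length):
        chunk = ids[start:start + seq_length + 1]
        pairs.append((chunk[:-1], chunk[1:]))
    return vocab, pairs
```

Each pair contributes `seq_length` character predictions per pass over the data, which is how a fairly small set of affirmations can still add up to a large number of predictions after many passes.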
Instead of manually finding hundreds of images, we decided to web scrape a Google search for the URLs. We first tried an existing API service called SerpApi. We made a free account on their website, but this only gave us access to 100 searches and returned the data in a somewhat inconvenient JSON format. While we did not end up using SerpApi, it helped us learn a lot about how searches are formatted and what information is stored in the URL of a Google search. After searching through GitHub, we found an existing project in which someone web scraped manually using a Google Chrome driver. This approach is not as time efficient, but the code was easy to edit to serve our purposes. Once we had that, it was just a matter of integrating it with the database so that scraped URLs were stored in the correct tables. We then ran the program to populate the database. We populated the images table with the first 250 or so images from the search “anime funny png” and the background table with results from searches like “background outdoors” and “sailboats.”
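To give a sense of what is stored in a search URL, here is a minimal sketch that builds an image search URL from a query. It assumes the `q` parameter carries the search terms and that `tbm=isch` selects image results, which is how Google image search URLs have commonly looked; the function name is hypothetical and this is not the scraper code we used.

```python
from urllib.parse import urlencode


def image_search_url(query):
    """Build a Google image search URL for the given query (illustrative only)."""
    # "q" holds the search terms; "tbm=isch" is assumed to request image results.
    params = {"q": query, "tbm": "isch"}
    return "https://www.google.com/search?" + urlencode(params)
```

A browser driver can then open the URL returned for a query such as `"anime funny png"` and collect the image links from the page.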
We utilized CockroachDB to host a serverless cluster on Google Cloud. Initially, we designed our system to use a single database containing three tables: affirmations, images, and connections. The affirmations table would store the text output by the TensorFlow neural network; the images table would store the URLs of images that had been web scraped; and the connections table would log connections between the two. We then realized that this design wouldn’t quite work, because each affirmation created would have two images of different types. The initial solution was to add a boolean column to images, is_anime, but this still left the connections table with two foreign keys to the same table, which didn’t seem like a sustainable or logical solution. We ended up dropping the is_anime column and creating a new table called background containing the same columns as the images table.
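The final four-table layout can be sketched as follows. This example uses Python's built-in `sqlite3` module as a local stand-in for the CockroachDB cluster, and every column name here is an assumption made for illustration; only the table names come from the description above.

```python
import sqlite3

# Column names are hypothetical; the write-up only names the four tables.
SCHEMA = """
CREATE TABLE affirmations (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE images (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
CREATE TABLE background (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
CREATE TABLE connections (
    id INTEGER PRIMARY KEY,
    affirmation_id INTEGER NOT NULL REFERENCES affirmations(id),
    image_id INTEGER NOT NULL REFERENCES images(id),
    background_id INTEGER NOT NULL REFERENCES background(id)
);
"""


def create_schema(conn):
    """Create the four tables on an open database connection."""
    conn.executescript(SCHEMA)
```

With the background table split out, each row in connections points at three different tables instead of holding two foreign keys to images.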
As each of us worked on a separate part of the project, it became clear that we needed something to tie the image retrieval and affirmation generation pieces together. This began with an integration with the cloud-based database. Setting up database access on each team member’s computer was a challenge, but once we did, we were able to incorporate the database access into our code. There are several steps to this incorporation. First, an anime image and a background are selected from the database and their ids are saved. They are selected based on how often they have been used, so every image is used once before any image is used a second time by the generator. Next, an affirmation text is generated and a new row is added to the affirmations table. These three ids are then used to create a row in the connections table for logging purposes. After that, the image URLs are retrieved from the database and passed into the image processing system.
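The least-used selection rule described above could look something like the sketch below. It again uses `sqlite3` as a stand-in for the cloud database and assumes a hypothetical `times_used` counter column on each image table; the write-up does not say how usage was actually tracked, and breaking ties by id is also an assumption.

```python
import sqlite3

ALLOWED_TABLES = ("images", "background")


def pick_least_used(conn, table):
    """Return the id of the least-used row in `table` and bump its counter."""
    # Only known table names are interpolated into the SQL text.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    row = conn.execute(
        f"SELECT id FROM {table} ORDER BY times_used ASC, id ASC LIMIT 1"
    ).fetchone()
    if row is None:
        raise LookupError(f"{table} is empty")
    conn.execute(
        f"UPDATE {table} SET times_used = times_used + 1 WHERE id = ?", (row[0],)
    )
    return row[0]
```

Because the row with the lowest count always wins, no image is picked a second time until every other image has been picked once.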
Image processing was the easiest part of the puzzle, yet also the most specific. The biggest part of it was combining two images through the Pillow library. Our initial approach was to use the paste function, but that did not work for images with different alpha values. Alpha is the fourth channel of a pixel and describes its opacity. The next approach, which was successful, used the alpha_composite function, but it required images of the same size. Thus the paste function was first used to paste the secondary image onto a white background image, making one large image the same size as the primary image and allowing the alpha_composite function to do its job. The differing alpha values came from our attempt to reduce blank pixels in the secondary images, which commonly had white and gray checkerboard backgrounds. We processed the secondary images to make the background transparent by identifying pixels whose r, g, and b values were all equal and above a certain value in the 0-255 range. In our case we chose 180, as we found the checkerboard backgrounds commonly used gray and white values above 180. Finally, the image needed to have the AI-generated text added to it. The text color was assigned randomly and manipulated to emulate the vibrant text glow effect that WPI Affirmations typically has. The text was split into two sections to make a top caption and a bottom caption. A series of algorithms computed the required font size and placement for these captions so that the image in the center stays visible and the text is still readable. With the text randomly generated, separated, colored, sized, and placed onto the image, the AI-generated WPI Affirmations post was finally complete.
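The transparency and compositing steps described above can be sketched with Pillow roughly as follows. The function names are hypothetical, and one detail is an assumption: the same-size canvas is created fully transparent so that the primary image remains visible around the pasted secondary image.

```python
from PIL import Image

GRAY_THRESHOLD = 180  # threshold mentioned in the write-up


def clear_checkerboard(img, threshold=GRAY_THRESHOLD):
    """Return an RGBA copy where light gray and white pixels are transparent."""
    out = img.convert("RGBA")
    px = out.load()
    for y in range(out.height):
        for x in range(out.width):
            r, g, b, a = px[x, y]
            # Equal r, g, b above the threshold is treated as checkerboard.
            if r == g == b and r > threshold:
                px[x, y] = (r, g, b, 0)
    return out


def overlay(primary, secondary, position=(0, 0)):
    """Composite `secondary` over `primary` at `position`."""
    base = primary.convert("RGBA")
    # alpha_composite needs equal sizes, so paste onto a canvas sized like base.
    # Assumption: the canvas starts fully transparent.
    canvas = Image.new("RGBA", base.size, (255, 255, 255, 0))
    canvas.paste(secondary.convert("RGBA"), position)
    return Image.alpha_composite(base, canvas)
```

Calling `overlay(background, clear_checkerboard(anime_image), (x, y))` would then give a single image ready for the captions. Note that the equal-channel rule also clears any genuinely white or light gray parts of the subject, which is a limitation of this simple threshold.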
Overall, we accomplished and learned a lot through this hackathon and are very happy with the final result. This real-world project utilized libraries and tools we could actually see ourselves using, rather than the more abstract stuff we generally do in class. Moreover, we strengthened our ability to learn from APIs, online documentation, and forums. Finally, we really enjoy the final program and find the AI’s idea of what a typical WPI student will affirm to be funny. It doesn’t work perfectly, and that’s the best part, because it creates exactly what we aimed for: a funny homage to a funny Instagram page.