Plotomatic

Cat. Captioned.

Inspiration

Our team wanted to build something that could generate stories, and to work with technologies that were new to us.

What it does

Plotomatic generates funny mad-libs style captions for images.

How we built it

We split into two teams of two. One team worked on the front end and the image analysis, while the other worked on turning the keywords generated from the images into sentences. We followed an iterative design process: we identified the individual components of each part of the project, attempted a solution for each, and, if a solution did not work, tried again until it did.

The website was hosted on a virtual machine on the Google Cloud Platform and built with Bootstrap and HTML, with client-side JavaScript handling the image transmission. The website sent the image to our Python backend server, which used Microsoft Azure to generate keywords describing the image. Those keywords were returned as JSON, then parsed and processed. A few keywords were then picked at random and inserted into one or more randomly chosen sentences to form the caption, which was sent back to the client-side JavaScript to display to the user.
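To make the keyword-to-caption step concrete, here is a minimal sketch in Python. It is not our actual code. The sample JSON, the templates, the confidence cutoff, and the function names `extract_keywords` and `make_caption` are all made up for illustration, and the response shape (a `tags` list with `name` and `confidence` fields) is an assumption about what the image analysis returns.

```python
import json
import random

# Hypothetical analysis response; the real payload may be shaped differently.
SAMPLE_RESPONSE = """
{"tags": [
    {"name": "cat", "confidence": 0.99},
    {"name": "sofa", "confidence": 0.87},
    {"name": "indoor", "confidence": 0.42}
]}
"""

# Illustrative mad-libs style templates; each "{}" is a slot for one keyword.
TEMPLATES = [
    "Nobody expected the {} to fall in love with the {}.",
    "Breaking news: local {} wins the lottery.",
]


def extract_keywords(response_text, min_confidence=0.5):
    """Parse the JSON response and keep tag names above a confidence cutoff."""
    data = json.loads(response_text)
    return [
        tag["name"]
        for tag in data.get("tags", [])
        if tag.get("confidence", 0.0) >= min_confidence
    ]


def make_caption(keywords, rng=None):
    """Pick a random template and fill its slots with random keywords."""
    rng = rng or random.Random()
    if not keywords:
        # Fallback caption (an assumption) when no usable keywords come back.
        return "Words fail us."
    template = rng.choice(TEMPLATES)
    slots = template.count("{}")
    if len(keywords) >= slots:
        picks = rng.sample(keywords, slots)  # no repeated keyword
    else:
        picks = [rng.choice(keywords) for _ in range(slots)]
    return template.format(*picks)


keywords = extract_keywords(SAMPLE_RESPONSE)
caption = make_caption(keywords)
print(caption)
```

Passing a seeded `random.Random` as `rng` makes the caption repeatable, which is handy for testing. Leaving it out gives a different caption on each call.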

Challenges we ran into

Initially, we tried to use the Natural Language Toolkit (NLTK) to generate stories, but the sentences it generated were simplistic and not very interesting. We switched to mad-libs style generation because it allowed us to add humor that would be much harder for a machine to learn.

Accomplishments that we're proud of

Building a full-stack webapp!

What we learned

We learned that APIs are really useful for building something efficiently, and we learned how machine learning works when it comes to text and language.

What's next for Plotomatic

Using a recurrent neural network to generate predictive text based on a description of the uploaded image.

Built With

Submitted to

Hack the North 2018

Created by

I went full stack and I'm the reason you get the same caption every time :)))

I mostly made the front end and did server stuff too

Will Harris
UW SE 2024 :)
I worked on the text output for the images that would get uploaded to the website, making use of Microsoft Azure, Postman, and Google Colaboratory. This was also my first time using Python, and my first hackathon! It was an enjoyable experience.

Simran Thind
uWaterloo software engineering, avid proponent of sleeping in
I worked on the back end of this project. I focused on setting up our web server on the Google Cloud Platform and writing the Python code that receives the image data from the client, gets the analysis from Microsoft Azure's Computer Vision API, and returns it to our caption line constructor. I also helped the team troubleshoot various issues that came up in all facets of development.

Adam Kalman
Carol Xu
University of Waterloo SE Class of 2025!