What it does

A user enters the URL of any image on the web. Using the Clarifai API, we get a list of tags that describe that image. Using these tags, we run a Google search to obtain a few similar images. We also pull text from a number of real BuzzFeed articles and several Wikipedia pages related to the image. Together, this text forms a large corpus that is used to train a custom-built Tri-gram model, which generates a brand new article intended to combine the topic of the image (from the Wikipedia pages) with the writing style of BuzzFeed. We also generate a new title using the Tri-gram model and a few real BuzzFeed article titles. The found images, along with the generated title and article, are passed to our front end, which displays them in a BuzzFeed-ish format.
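To make the generation step concrete, here is a minimal sketch of a trigram text generator in Python. It is not our actual model: the function names, the whitespace tokenization, and the two sample strings standing in for the BuzzFeed and Wikipedia text are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_trigrams(text):
    """Map each pair of consecutive words to the words observed right after it."""
    words = text.split()
    model = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        model[(first, second)].append(third)
    return model

def generate(model, length=50, seed=None):
    """Sample at most `length` words, each drawn from the followers of the last two."""
    if not model:
        return ""
    rng = random.Random(seed)
    first, second = rng.choice(sorted(model))  # random starting pair
    output = [first, second]
    while len(output) < length:
        followers = model.get((output[-2], output[-1]))
        if not followers:  # dead end: this pair only occurs at the end of the corpus
            break
        output.append(rng.choice(followers))
    return " ".join(output[:length])

# Hypothetical stand-ins for the scraped BuzzFeed (style) and Wikipedia (topic) text.
style_text = "you will not believe what this cat did next and you will not stop laughing"
topic_text = "the cat is a small domesticated carnivorous mammal and the cat is often kept as a pet"
model = train_trigrams(style_text + " " + topic_text)
article = generate(model, length=30, seed=0)
print(article)
```

Because followers are stored with repeats, frequent continuations are picked more often, which is what lets the output drift between the two sources wherever they share a word pair.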

How we built it

This project was divided into three main components: the web page front end, the server, and the Tri-gram model. Each team member worked on one component until the components were ready to be merged into the full product.

For data sources, we used the Clarifai, BuzzFeed, and Google search APIs, plus Wikipedia (by web scraping).

Challenges we ran into

Many of the challenges we faced were API-related.

  • It turned out that Clarifai's API does not have similar image search functionality at this time, so we had to use a combination of the tagging API and Google search to get the similar images.
  • We wanted to search BuzzFeed's articles using our image tags so we could get targeted training data for the Tri-gram model, but this proved very difficult because the API has no search functionality. To get around this, we used a combination of random BuzzFeed articles and Wikipedia results for our tags, hoping to achieve a similar result.
  • By working on components individually, we were able to fully utilize our team, but it was a challenge to coordinate our portions so we could merge them easily when the time came.
  • Writing a Tri-gram model to generate something resembling language from scratch overnight is a challenging task, to say the least.
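As a rough illustration of the similar-image workaround described in the first challenge, the sketch below turns a list of image tags into a single search query. The (tag, confidence) input shape, the thresholds, the function name, and the example.com endpoint are all assumptions for illustration; they do not reflect the real Clarifai or Google search interfaces.

```python
from urllib.parse import urlencode

def build_image_query(tags, max_tags=3, min_confidence=0.9):
    """Keep the most confident tags and join them into one search query string."""
    confident = [t for t in tags if t[1] >= min_confidence]
    ranked = sorted(confident, key=lambda t: t[1], reverse=True)
    return " ".join(name for name, _ in ranked[:max_tags])

# Hypothetical tagger output: (tag, confidence) pairs.
tags = [("cat", 0.99), ("pet", 0.97), ("cute", 0.95), ("fur", 0.91), ("sofa", 0.42)]
query = build_image_query(tags)

# Placeholder endpoint, not a real search API.
search_url = "https://example.com/search?" + urlencode({"q": query, "type": "image"})
print(search_url)
```

Capping the number of tags keeps the query specific enough to return images that resemble the original rather than images matching any one loose tag.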

Accomplishments that we're proud of

  • Using so many different APIs to get the data we needed for this to work at all.
  • Overall, we were able to split up the work very effectively.
  • Nils's first hackathon!

What we learned

  • Nils learned how to use Git
  • Akshay learned about N-grams from Nils
  • Aditya gained a lot of experience in web development

What's next for NBuzzGram

Hopefully some performance improvements!
