NBuzzGram

Inspiration

What it does

A user enters the URL of any image on the web. Using the Clarifai API, we get a list of tags that describe that image. Using these tags, we run a Google search to obtain a few similar images. We also get text from a number of real BuzzFeed articles and several Wikipedia pages relating to the image that was entered. Together, this text forms a large corpus that is used to train a custom-built trigram model, which generates a brand-new article intended to combine the topic of the image (from the Wikipedia articles) with the writing style of BuzzFeed. We also generate a new title using the trigram model and a few real BuzzFeed article titles. The found images, along with the generated title and article, are passed to our front end, which displays them in a BuzzFeed-ish format.
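
The trigram step can be sketched roughly as follows. This is a minimal illustration of the general technique, not the code written at the hackathon: the function names `build_trigram_model` and `generate`, the whitespace tokenization, and the uniform random choice of the starting word pair are all assumptions made for the example.

```python
import random
from collections import defaultdict


def build_trigram_model(text):
    """Map each pair of consecutive words to the words seen right after it.

    Hypothetical helper: splits on whitespace only, with no punctuation handling.
    """
    words = text.split()
    model = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        model[(first, second)].append(third)
    return dict(model)


def generate(model, length=50, seed=None):
    """Random-walk the model, starting from a randomly chosen word pair."""
    if not model:
        return ""
    rng = random.Random(seed)
    first, second = rng.choice(sorted(model))
    output = [first, second]
    while len(output) < length:
        followers = model.get((first, second))
        if not followers:
            break  # dead end: this pair was never followed by another word
        first, second = second, rng.choice(followers)
        output.append(second)
    return " ".join(output)
```

Because followers are stored with repeats, a word that follows a pair more often in the corpus is proportionally more likely to be picked. Training on the combined BuzzFeed and Wikipedia text and calling `generate(model, length=200)` would produce an article-length string; a much shorter `length` would suit a title.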

How we built it

This project was divided into three main components: the web page front end, the server, and the trigram model. Each team member worked on one component until all three were ready to be merged into the full product.

For APIs we used Clarifai, BuzzFeed, Google Search, and Wikipedia (by web scraping).
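
The way the components hand data to one another can be sketched as a single function. This is only an outline under stated assumptions: `build_page` and its four callable parameters are hypothetical stand-ins for the real API calls and the trigram model, and the sketch simplifies by generating the title from the same corpus as the article.

```python
def build_page(image_url, tag_image, search_images, fetch_corpus, generate_text):
    """Run the stages in order; every callable argument is a hypothetical stand-in."""
    tags = tag_image(image_url)                    # stand-in for the image tagging call
    similar_images = search_images(" ".join(tags))  # stand-in for an image search on the tags
    corpus = fetch_corpus(tags)                    # stand-in for gathering BuzzFeed and Wikipedia text
    return {
        "images": similar_images,
        "title": generate_text(corpus, 10),     # short output for the headline
        "article": generate_text(corpus, 300),  # longer output for the body
    }
```

Passing the stages in as arguments keeps each component independently replaceable, which mirrors how the work was split across the team.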

Challenges we ran into

Many of our challenges were API-related.

It turned out that Clarifai's API does not have similar image search functionality at this time, so we had to use a combination of the tagging API and Google search to get the similar images.
We wanted to search BuzzFeed's articles using our image tags so we could get targeted training data for the trigram model, but this proved very difficult because the API has no search functionality. To get around this, we used a combination of random BuzzFeed articles and Wikipedia results for our tags, hoping to achieve a similar result.
By working on components individually, we were able to fully utilize our team, but it was a challenge to coordinate our portions so we could merge them easily when the time came.

Writing a trigram model from scratch overnight that generates something resembling language is a challenging task, to say the least.

Accomplishments that we're proud of

Using so many different APIs to get the data we needed for this to work at all.
Overall, we were able to split up the work very effectively.
Nils's first hackathon!

What we learned

Nils learned how to use git
Akshay learned about N-grams from Nils
Aditya gained much experience in web development

What's next for NBuzzGram

Hopefully some performance improvements!

Built With

Submitted to

Fall 2015 hackNY Student Hackathon

Created by

I created the server component using Node.js. This component took in the image URL and communicated with the various APIs to get the assets needed. It passed data to the trigram model to generate the new strings, and then passed that information to the web app front end for display. I also worked on the styling and formatting of the front end.

deleted deleted
I worked on the front end's input and output pages, which present the generated pages and articles. I also worked on generating the article from the trigram model's output, as returned by our server.

Private user
I wrote the code to generate the trigram model based on a text corpus that Akshay scraped from BuzzFeed and Wikipedia. My code used the trigram model to automatically generate the text on the website.

Nils H

Updates

deleted deleted started this project — Sep 27, 2015 09:32 AM EDT
