Inspiration

Looking for something to buy on Amazon? Want to check the Yelp reviews before you go somewhere? Fake reviews, or reviews whose star rating doesn't match the text, are really annoying.

What it does

We provide a basic, platform-independent web API that accepts a user review together with its star rating. We process the data and respond with an estimate of the review's quality and a check of whether the star rating matches the review text.
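As a rough illustration of that request/response flow, here is a minimal sketch in Python. The field names (`text`, `stars`, `usefulness`, `rating_matches_text`), the linear mapping from sentiment to stars, the one-star tolerance, and the two stand-in scorers are all assumptions made for this example, not the project's actual interface or models.

```python
# Hypothetical word lists for the stand-in sentiment scorer below.
POSITIVE = {"great", "love", "excellent"}
NEGATIVE = {"broke", "terrible", "awful"}


def demo_usefulness(text):
    """Stand-in scorer: longer reviews count as more useful, capped at 1.0."""
    return min(1.0, len(text.split()) / 50)


def demo_sentiment(text):
    """Stand-in scorer: word-list sentiment clamped to [-1, 1]."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return max(-1.0, min(1.0, score / 2))


def expected_stars(sentiment):
    """Map a sentiment score in [-1, 1] onto the 1-5 star scale (assumed linear)."""
    return 3 + 2 * max(-1.0, min(1.0, sentiment))


def handle_review(payload, usefulness_fn, sentiment_fn, tolerance=1.0):
    """Validate a review payload and build the response dictionary."""
    text = payload["text"]
    stars = payload["stars"]
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if not 1 <= stars <= 5:
        raise ValueError("stars must be between 1 and 5")
    sentiment = sentiment_fn(text)
    return {
        "usefulness": usefulness_fn(text),
        "sentiment": sentiment,
        # The rating "matches" when it is within `tolerance` stars of the
        # rating implied by the sentiment score.
        "rating_matches_text": abs(expected_stars(sentiment) - stars) <= tolerance,
    }


mismatch = handle_review(
    {"text": "terrible product, it broke after a day", "stars": 5},
    demo_usefulness,
    demo_sentiment,
)
match = handle_review(
    {"text": "great phone, love it", "stars": 5},
    demo_usefulness,
    demo_sentiment,
)
```

Passing the scorers in as functions keeps the handler independent of whichever model or sentiment service sits behind it.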

How we built it

We used IBM Bluemix to create the dataflow and to expose the API to the user. The model is hosted on a separate server and is queried by Bluemix. We also installed MongoDB on a server to query the Amazon review data (sorry, we couldn't use the Bluemix database, as it was limited to 64 MB per import and we had about 20 GB of data to import). We use the model to predict the "usefulness" of a given review, based on the structure of its sentences in relation to the upvote scores of the reviews in the dataset. We also use the IBM Alchemy API to detect the emotion of the review and compare it with the rating.
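To make the "usefulness" idea more concrete, the sketch below shows one way training data could be derived from upvote scores and sentence structure. It assumes each review record carries a `reviewText` string and a `helpful` pair of `[helpful_votes, total_votes]`. The minimum-vote threshold and the two structural features are illustrative choices for this example, not the features the project actually used.

```python
import re


def sentence_features(text):
    """Very simple structural features: sentence count and average sentence length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    word_counts = [len(s.split()) for s in sentences]
    n = len(sentences)
    return {
        "sentences": n,
        "avg_words": sum(word_counts) / n if n else 0.0,
    }


def usefulness_target(helpful_votes, total_votes, min_votes=5):
    """Fraction of voters who found the review helpful.

    Returns None when there are too few votes for the ratio to mean much
    (the threshold of 5 is an assumption).
    """
    if total_votes < min_votes:
        return None
    return helpful_votes / total_votes


def build_training_pairs(records, min_votes=5):
    """Turn raw review records into (features, target) pairs for a regressor."""
    pairs = []
    for record in records:
        helpful, total = record["helpful"]
        target = usefulness_target(helpful, total, min_votes)
        if target is not None:
            pairs.append((sentence_features(record["reviewText"]), target))
    return pairs


# Made-up records, only to show the shapes involved.
records = [
    {"reviewText": "Works well. Battery lasts two days!", "helpful": [8, 10]},
    {"reviewText": "ok", "helpful": [1, 2]},
]
pairs = build_training_pairs(records)
```

The resulting pairs could then be fed to any regression model. Dropping reviews with very few votes avoids treating a single upvote as a perfect usefulness score.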

Challenges I ran into

First of all, we really wanted to use the IBM Watson API a lot, but sadly it was strictly limited to about 1000 queries, which made it impossible to process our large datasets. We also struggled to find datasets big enough to train the model. We contacted Julian McAuley, who quickly gave us access to the Amazon review datasets he hosts. Here are the citations:

J. McAuley, C. Targett, J. Shi, A. van den Hengel. Image-based recommendations on styles and substitutes. SIGIR, 2015.

J. McAuley, R. Pandey, J. Leskovec. Inferring networks of substitutable and complementary products. Knowledge Discovery and Data Mining, 2015.

Accomplishments that we are proud of

Getting everything to work together wasn't as easy as we thought. But we learned quite a lot this way, especially because we combined several different systems (Bluemix, MongoDB, Python, scikit-learn, ...).

What's next for reviewAnalyzer

More improvements to the quality of the predictions, of course, plus fake-review detection as a future feature.
