Evaluating customer feedback on a product is important for improving it. But there are millions of products and billions of reviews, so a single conventional computer cannot practically store or process all of that data. In real-world use cases, projects of this kind are therefore built on a distributed framework such as Hadoop/MapReduce.

What it does

The project calculates the overall positive, negative, and neutral sentiment for a product by tokenizing its reviews.
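To illustrate the token-based idea, here is a minimal sketch in plain Java. It assumes a review is split into lowercase word tokens and each token is checked against small positive and negative word lists. The class name `TokenSentiment`, the method `classify`, and the word lists are hypothetical and chosen for illustration; they are not taken from the project's code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class TokenSentiment {
    // Hypothetical word lists; a real job would load a much larger lexicon.
    static final Set<String> POSITIVE =
            new HashSet<>(Arrays.asList("good", "great", "love", "excellent"));
    static final Set<String> NEGATIVE =
            new HashSet<>(Arrays.asList("bad", "poor", "hate", "broken"));

    // Tokenize the review and sum +1 / -1 for each matching token.
    static String classify(String review) {
        int score = 0;
        for (String token : review.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (POSITIVE.contains(token)) {
                score++;
            } else if (NEGATIVE.contains(token)) {
                score--;
            }
        }
        return score > 0 ? "positive" : (score < 0 ? "negative" : "neutral");
    }

    public static void main(String[] args) {
        System.out.println(classify("Great phone, I love it"));         // positive
        System.out.println(classify("Arrived broken, poor packaging")); // negative
        System.out.println(classify("It is a phone"));                  // neutral
    }
}
```

A review with no matching tokens, or with equal numbers of positive and negative tokens, is labelled neutral in this sketch.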

How I built it

We configured Hadoop in pseudo-distributed mode, set up Eclipse as the development environment, and used MapReduce as the data processing framework.
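The shape of such a MapReduce job can be sketched without any Hadoop dependency. The plain-Java simulation below assumes each input record is a product ID and review text separated by a tab; that input format, the class name `SentimentJob`, and the word lists are assumptions made for illustration. The map step emits a (product, label) pair, the driver groups pairs by product to stand in for the shuffle, and the reduce step counts labels per product.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SentimentJob {
    // Hypothetical word lists, for illustration only.
    static final Set<String> POSITIVE = new HashSet<>(Arrays.asList("good", "great", "love"));
    static final Set<String> NEGATIVE = new HashSet<>(Arrays.asList("bad", "poor", "hate"));

    // Map step: one "productId<TAB>review" record becomes a (productId, label) pair.
    static Map.Entry<String, String> map(String record) {
        String[] parts = record.split("\t", 2);
        String text = parts.length > 1 ? parts[1] : "";
        int score = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (POSITIVE.contains(token)) score++;
            else if (NEGATIVE.contains(token)) score--;
        }
        String label = score > 0 ? "positive" : (score < 0 ? "negative" : "neutral");
        return new AbstractMap.SimpleEntry<>(parts[0], label);
    }

    // Reduce step: count how often each label occurs for one product.
    static Map<String, Integer> reduce(List<String> labels) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }

    // Driver: groups map output by key (simulating the shuffle), then reduces each group.
    static Map<String, Map<String, Integer>> run(List<String> records) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String record : records) {
            Map.Entry<String, String> kv = map(record);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        Map<String, Map<String, Integer>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(
                "B001\tgreat product", "B001\tbad battery", "B002\tit is a phone")));
    }
}
```

In an actual Hadoop job, the map and reduce logic would live in subclasses of `Mapper` and `Reducer` from `org.apache.hadoop.mapreduce`, and the framework would handle the grouping that `run` simulates here.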

Challenges I ran into

The main challenges were configuring Hadoop and setting up the development environment.

Accomplishments that I'm proud of

Successfully running the sentiment analysis logic.

What I learned

Hadoop configuration, Hadoop daemons, distributed processing and storage, MapReduce, and distributed hash tables.

What's next for SentmentAnalysis-AmazonReviews

We are currently using an open-source dataset of reviews. Next, we could build a web crawler to scrape review data for real products from sites such as Amazon and Craigslist, and run the sentiment analysis batch job on it.
