Evaluating customer feedback on a product is important for improving it. But there are millions of products and billions of reviews, so a single conventional computer cannot realistically store or process all of that data. In real-world use cases, projects like this are therefore built on a distributed framework such as Hadoop/MapReduce.

What it does

The project calculates the overall positive, negative and neutral sentiment for each product by tokenizing its review text and scoring the resulting tokens.
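The token-based scoring described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the word lists and the names `tokenize` and `classify_review` are assumptions made up for the example, and a real lexicon would be much larger.

```python
import re

# Hypothetical word lists, for illustration only.
# The lexicon the project actually used is not given in this write-up.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "broken"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify_review(text):
    """Label one review by counting positive and negative tokens."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positives - negatives
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No sentiment words, or an equal number of each.
    return "neutral"
```

A review with as many positive as negative tokens comes out neutral under this scheme, which is one simple way to get the third category.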

How I built it

We configured Hadoop in pseudo-distributed mode, set up Eclipse as the development environment, and used MapReduce as the data processing framework.
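To show how the MapReduce stage fits together, here is a small local simulation of the map, shuffle and reduce steps. It is a sketch under stated assumptions and does not use the Hadoop API: the tab-separated input format, the product IDs, the word lists and the function names (`label`, `mapper`, `reducer`, `run_job`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical word lists, for illustration only.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "broken"}


def label(review):
    """Score a review by counting positive and negative words."""
    words = review.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def mapper(line):
    """Map step: emit ((product_id, sentiment), 1) for one input line.

    Assumes each line looks like 'product_id<TAB>review text'.
    """
    product_id, review = line.split("\t", 1)
    yield (product_id, label(review)), 1


def reducer(key, values):
    """Reduce step: add up the counts collected for one key."""
    return key, sum(values)


def run_job(lines):
    """Simulate the shuffle locally: group mapper output by key, then reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))
```

Calling `run_job` on a list of review lines returns a dictionary keyed by `(product_id, sentiment)` with the number of matching reviews. On a real cluster the framework performs the grouping between the map and reduce phases, and the input would be read from HDFS rather than from an in-memory list.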

Challenges I ran into

Configuring Hadoop and setting up the development environment.

Accomplishments that I'm proud of

Successfully executing the sentiment analysis logic.

What I learned

Hadoop configuration, Hadoop daemons, distributed processing and storage, MapReduce, and distributed hash tables.

What's next for SentmentAnalysis-AmazonReviews

We currently use an open-source dataset of reviews. As a next step, we could build a web crawler to scrape review data for real products from sites like Amazon and Craigslist, and run the sentiment analysis on it as a batch job.
