Inspiration

Natural language processing has long been an interest of mine, and I wanted an introductory project that would let me learn the basics of machine learning. That's when I came across this Kaggle competition and decided to use its data and concept: https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews

What it does

For a given movie review, the web app predicts the review's sentiment on a scale from 0 to 4 (0: Negative, 1: Somewhat Negative, 2: Neutral, 3: Somewhat Positive, 4: Positive).

How I built it

I obtained the training data from Kaggle and chose the multinomial Naive Bayes model from scikit-learn as the classifier. For preprocessing, I used the Natural Language Toolkit (NLTK) for tokenization, lemmatization, and other techniques to improve the classifier's accuracy.
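A minimal sketch of this kind of pipeline might look like the following. It is not the project's actual code: the phrases are made up for illustration (they are not the Kaggle data), `tokenize_and_stem` is a hypothetical helper, and a Porter stemmer stands in for lemmatization because NLTK's WordNet lemmatizer needs an extra corpus download.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Both of these NLTK tools work without downloading any corpus data.
_tokenizer = RegexpTokenizer(r"[a-z']+")
_stemmer = PorterStemmer()


def tokenize_and_stem(text):
    """Hypothetical helper: lowercase, split into word tokens, stem each token."""
    return [_stemmer.stem(token) for token in _tokenizer.tokenize(text.lower())]


# Toy phrases invented for this sketch, one per sentiment class (0 to 4).
train_phrases = [
    "a terrible waste of time",
    "rather dull and forgettable",
    "it is a movie about a family",
    "fairly enjoyable and charming",
    "an absolutely wonderful masterpiece",
]
train_labels = [0, 1, 2, 3, 4]

# Bag-of-words counts feed the multinomial Naive Bayes classifier.
model = make_pipeline(
    CountVectorizer(tokenizer=tokenize_and_stem, token_pattern=None),
    MultinomialNB(),
)
model.fit(train_phrases, train_labels)

prediction = model.predict(["a wonderful and charming film"])[0]
print("Predicted sentiment class:", prediction)
```

Wrapping the vectorizer and classifier in one pipeline means the same preprocessing is applied at training and prediction time, which is convenient when the model is later called from a web app.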

Challenges I ran into

As a first-timer in natural language processing, I found the first challenge was converting sentences into variables that can be fed into a classifier. My mentor directed me to various online resources over the course of the project, and I ended up learning many preprocessing techniques for organizing training data in an algorithm-interpretable way. Another challenge was the limited RAM of Google Colab, but I was able to work around this by storing data in NumPy arrays instead of pandas DataFrames whenever possible.
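To illustrate where memory savings like this can come from, here is a small sketch comparing the footprint of the same labels held in a pandas DataFrame and in a compact NumPy array. The data is synthetic, and the `int8` dtype is an assumption chosen because the labels only range from 0 to 4; the write-up does not say which arrays or dtypes the project actually used.

```python
import numpy as np
import pandas as pd

# Synthetic sentiment labels in the range 0..4 (not the Kaggle data).
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=100_000)  # int64 by default

# The same values stored two ways.
as_frame = pd.DataFrame({"Sentiment": labels})  # keeps the int64 column plus an index
as_array = labels.astype(np.int8)               # assumption: 1 byte per label is enough

frame_bytes = int(as_frame.memory_usage(deep=True).sum())
array_bytes = as_array.nbytes

print(f"DataFrame: {frame_bytes} bytes")
print(f"NumPy int8 array: {array_bytes} bytes")
```

Most of the difference in this sketch comes from the narrower dtype rather than from NumPy itself, so how much memory is saved in practice depends on what is being stored.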

Accomplishments that I'm proud of

This is the first machine learning/web development project I've worked on, and I'm really proud of developing something that actually runs on a user's computer and makes reasonable predictions.

What I learned

Natural language processing techniques (tokenization, lemmatization, stemming, bag-of-words representation, etc.); the APIs of scikit-learn and NLTK; the effectiveness of different Naive Bayes models (Gaussian, Bernoulli, Multinomial, Complement, etc.); and the basics of web development.
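For reference, the four Naive Bayes variants named above can be tried side by side in scikit-learn roughly as follows. This is only a sketch on made-up phrases: the scores are measured on the training phrases themselves, so they say nothing about real performance on the Kaggle data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, ComplementNB, GaussianNB, MultinomialNB

# Toy phrases invented for this sketch (labels use the 0..4 sentiment scale).
phrases = [
    "a terrible waste of time",
    "boring and painfully slow",
    "it is a movie about a family",
    "the film runs two hours",
    "an absolutely wonderful masterpiece",
    "charming funny and moving",
]
labels = [0, 0, 2, 2, 4, 4]

# Bag-of-words counts as a sparse matrix.
counts = CountVectorizer().fit_transform(phrases)

models = {
    "Gaussian": GaussianNB(),
    "Bernoulli": BernoulliNB(),
    "Multinomial": MultinomialNB(),
    "Complement": ComplementNB(),
}

train_accuracy = {}
for name, model in models.items():
    # GaussianNB expects a dense array; the other variants accept sparse input.
    features = counts.toarray() if name == "Gaussian" else counts
    model.fit(features, labels)
    train_accuracy[name] = model.score(features, labels)

for name, score in train_accuracy.items():
    print(f"{name}: training accuracy {score:.2f}")
```

For a fair comparison on real data, the scores would need to come from a held-out validation split or cross-validation rather than from the training set.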

What's next for Movie Review Sentiment Predictor

I'd like to try different classification models, such as neural networks, to further improve the accuracy. I'd also like to learn HTML/CSS and build an app with a more sophisticated UI.
