Evaluation of Convolution LSTM for Sentiment Analysis

Grace Chen posted an update — Apr 25, 2024 06:46 PM EDT

Introduction: Consumer reviews on social media are a goldmine of user data and heavily influence business and brand image. However, it is difficult to extract meaningful, large-scale feedback from them without the help of deep learning models. Sentiment analysis (a classification task) addresses this by sorting reviews into predefined categories such as subjective/objective or positive/neutral/negative. This approach has previously been used with much success to filter movie reviews and even to predict the political leanings of Twitter users; however, current models are struggling to keep up with the exponential increase in the amount of available data.

The paper we have chosen proposes a model that outperforms previously implemented models for sentiment analysis by blending two deep learning architectures and drawing on the benefits of each. This hybrid approach, dubbed Co-LSTM, combines the precise local feature selection of a CNN with the effective sequential analysis of an LSTM. This allows for better scalability with big social data and makes for a more adaptable model that is not tied to a specific domain, which helps us take full advantage of current social big data when performing sentiment analysis.
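
As a rough illustration of the CNN-then-LSTM stacking described above, here is a minimal Keras sketch. It is not the paper's exact configuration: the function name build_co_lstm and every size (vocabulary, sequence length, embedding width, filter count, kernel size, LSTM units) are placeholder assumptions chosen only to make the example run.

```python
import tensorflow as tf


def build_co_lstm(vocab_size=10000, seq_len=200, embed_dim=64,
                  filters=64, kernel_size=5, lstm_units=64):
    """Hypothetical CNN -> LSTM binary sentiment classifier.

    All hyperparameters are illustrative assumptions, not values
    taken from the Co-LSTM paper.
    """
    # Integer token ids, one fixed-length sequence per review.
    inputs = tf.keras.Input(shape=(seq_len,), dtype="int32")
    # Map token ids to dense vectors.
    x = tf.keras.layers.Embedding(vocab_size, embed_dim)(inputs)
    # Convolution picks up local n-gram-like features.
    x = tf.keras.layers.Conv1D(filters, kernel_size,
                               padding="same", activation="relu")(x)
    # Pooling halves the sequence length before the recurrent layer.
    x = tf.keras.layers.MaxPooling1D(pool_size=2)(x)
    # The LSTM reads the pooled feature sequence in order.
    x = tf.keras.layers.LSTM(lstm_units)(x)
    # Sigmoid output: probability that the review is positive.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    # The output is already a probability, so from_logits stays False.
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
        metrics=["accuracy"],
    )
    return model
```

Calling build_co_lstm() returns a compiled model that can be passed padded integer sequences and 0/1 labels through model.fit. Keeping the architecture in one function like this would also make it easier to swap individual layers later.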

Challenges: Arguably our biggest challenge moving forward is to increase the accuracy of our model. Currently, our model outputs suspiciously low loss values during both training and testing. Those values sometimes increase throughout training as well, and the testing loss is often significantly lower than the training loss, even though testing loss is normally expected to be slightly higher than training loss. We suspect that pinpointing the cause of this issue will be especially difficult: our model uses a TensorFlow function to calculate binary cross-entropy loss, so the issue likely originates elsewhere in our model's architecture rather than within the loss function itself. Finding the precise location will likely require thorough troubleshooting.
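
One check that might help narrow this down is testing whether a reported loss is even mathematically compatible with the reported accuracy. With plain binary cross-entropy and a 0.5 decision threshold, every misclassified example contributes at least ln 2 to the loss, so the mean loss cannot fall below (1 - accuracy) * ln 2. The sketch below is pure Python and independent of our model; the helper names are hypothetical, and it assumes no label smoothing or other modification to the loss.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def min_loss_for_accuracy(accuracy):
    """Lower bound on mean BCE given accuracy at a 0.5 threshold.

    Each wrong prediction assigns the true class a probability of at
    most 0.5, so it costs at least ln 2; correct ones cost at least 0.
    """
    return (1.0 - accuracy) * math.log(2.0)


def is_consistent(loss, accuracy, slack=1e-6):
    """Return False when the loss is too low to match the accuracy."""
    return loss + slack >= min_loss_for_accuracy(accuracy)
```

For example, is_consistent(0.1, 0.5) returns False: a loss of 0.1 alongside 50% accuracy would suggest that the loss and the accuracy are not being computed from the same predictions and labels (for instance a shape or from_logits mismatch), which is where we would look first.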

Insights: Data from the IMDB review dataset (our main dataset) has been fully preprocessed and embedded. A basic model with all the paper's components is currently functional: it can be trained and tested, and it prints accuracy and loss values to the terminal. However, it displays low accuracy and inconsistent loss values.

Plan: We will finish preprocessing the other "goal" datasets (Airline Sentiment Dataset and Presidential Election Dataset) shortly. We will then start preprocessing the "stretch" Anime Review dataset. We have finished implementing our basic model to reflect the model described in the paper, and we are currently working on ways to increase our training and testing accuracy.

We need to dedicate more time to condensing and fleshing out our model architecture so that the model trains more stably. This will also help when we change the model in the future to experiment with different architectures. As discussed in the proposal, we will try replacing the LSTM with a Transformer to see how it impacts our training.
