Inspiration
Social media has a significant impact on our lives, and users share an enormous amount of information online. From trivial things like the weather to controversial topics like politics and war, users rely on social media to express their views on any topic they choose. Different topics draw different reactions from different users: some feel excited, others are offended, and some are neutral or unaware of the topic altogether. This is why it is important to be able to extract the underlying sentiment behind the data posted on the internet today.
Our project sorts tweets into three categories (positive, negative, and neutral) based on the emotions expressed in them. We specifically look into the applicability of a two-step classifier and negation detection in the context of Twitter sentiment analysis. In the age of big data, where the sheer volume of electronic communication is a significant bottleneck, an effective sentiment analyzer is a necessity. Using a variety of publicly accessible web datasets, we built a thorough set of preprocessing steps that prepare the tweets for natural language processing methods.
What it does
Data is extracted from Twitter, and the trained model identifies the underlying sentiment behind these tweets. The tweets are then displayed to users according to the topic and sentiment they choose to see. The workflow is as follows:
• New users can register by creating an account. Existing users can log in with their username and password.
• The user can choose to view tweets from a variety of topics such as soccer, food, Hollywood, etc., or search for a topic of their own interest. The user can also select from the drop-down whether to see positive, negative, or all sentiments for the selected topic.
• The latest tweets on the selected topic are displayed with their details.
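As a rough illustration of the last two workflow steps, the sketch below filters already-classified tweets by topic and sentiment and returns the newest ones first. The `Tweet` record, its field names, and the `select_tweets` helper are hypothetical names chosen for this example; the actual app stores its data in MySQL and serves it through Django, and its schema is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record layout; the real database schema may differ.
    text: str
    topic: str
    sentiment: str   # "positive", "negative", or "neutral"
    created_at: int  # Unix timestamp


def select_tweets(tweets, topic, sentiment="all", limit=10):
    """Return the newest tweets for a topic, optionally filtered by sentiment."""
    wanted_topic = topic.lower()
    matches = [
        t for t in tweets
        if t.topic == wanted_topic
        and (sentiment == "all" or t.sentiment == sentiment)
    ]
    # Latest tweets first, mirroring the "latest tweets" step above.
    matches.sort(key=lambda t: t.created_at, reverse=True)
    return matches[:limit]
```

Passing `sentiment="all"` corresponds to the drop-down option that shows every sentiment for the selected topic.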
How we built it
The following technologies were used, and the individual components were integrated to produce the desired outcome:
• AWS EC2, Lambda and API Gateway
• MySQL Database
• Django framework with HTML and CSS for the frontend
• Twilio for sending text messages of preferred tweets to users' mobile devices. Note that your mobile number needs to be registered on Twilio for this functionality to work.
• Registered the smartfeed.tech domain on domain.com
A long short-term memory (LSTM) model is used for the machine learning component. An LSTM is a type of recurrent neural network whose output takes prior inputs in the sequence into account. The model was trained on the Sentiment140 dataset hosted on Kaggle. The activations used in the model are:
- Sigmoid function: φ(z) = 1 / (1 + e^(−z))
- Tanh function: φ(z) = (e^z − e^(−z) )/(e^z + e^(−z))
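To show where these two activations sit inside an LSTM, here is a minimal scalar sketch of a single LSTM step written with the standard gate equations. It is only an illustration, not the project's trained model: the `lstm_step` helper, the `params` layout, and the gate names are assumptions made for this example, and a real model would use vectors and a deep learning framework.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-z)); output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z: float) -> float:
    """Hyperbolic tangent: (e^z - e^(-z)) / (e^z + e^(-z)); output lies in (-1, 1)."""
    ez, enz = math.exp(z), math.exp(-z)
    return (ez - enz) / (ez + enz)


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step.

    params maps each gate name to a hypothetical (w, u, b) triple:
    input weight, recurrent weight, and bias.
    """
    def gate(name, activation):
        w, u, b = params[name]
        return activation(w * x + u * h_prev + b)

    f = gate("forget", sigmoid)     # how much of the old cell state to keep
    i = gate("input", sigmoid)      # how much new information to admit
    o = gate("output", sigmoid)     # how much of the cell state to expose
    g = gate("candidate", tanh)     # proposed new cell content
    c = f * c_prev + i * g          # updated cell state (the "memory")
    h = o * tanh(c)                 # updated hidden state (the output)
    return h, c
```

The sigmoid gates act as soft switches between 0 and 1, while tanh keeps the candidate and output values between -1 and 1. Carrying `c` from one step to the next is what lets earlier words in a tweet influence the final prediction.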
Preprocessing and cleaning performed on the data:
1) Source data on the top 10 topics, such as science and sports, to gather 1M tweets.
2) Convert to lower case and remove punctuation and special characters.
3) Tokenize the text using a mapping dictionary.
4) Normalize text length using padding and truncating.
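The cleaning, tokenizing, and length-normalizing steps above can be sketched in plain Python as follows. The function names, the reserved ids (0 for padding, 1 for unknown words), and the choice to pad at the end are assumptions made for this example rather than details taken from the project.

```python
import re


def clean(text: str) -> str:
    """Lowercase the text and replace punctuation and special characters with spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse the runs of whitespace left behind by the substitution.
    return " ".join(text.split())


def build_vocab(texts):
    """Build the mapping dictionary from token to integer id."""
    vocab = {}
    for text in texts:
        for token in clean(text).split():
            if token not in vocab:
                vocab[token] = len(vocab) + 2  # ids 0 and 1 are reserved
    return vocab


def encode(text, vocab, max_len):
    """Map a tweet to a fixed-length list of ids by truncating or padding."""
    ids = [vocab.get(token, 1) for token in clean(text).split()]  # 1 = unknown word
    ids = ids[:max_len]                                           # truncate long tweets
    return ids + [0] * (max_len - len(ids))                       # pad short tweets with 0
```

For example, after `vocab = build_vocab(["Good game!", "Bad game"])`, calling `encode("Good", vocab, 3)` yields a three-element list whose trailing entries are padding zeros, so every tweet reaches the model with the same length.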
Website Link
Challenges we ran into
• Training the backend model
• Since our team is mostly made up of data science/ML developers and enthusiasts, we struggled a bit with the front-end side of things.
Accomplishments that we're proud of
• Successfully scraped Twitter data and displayed the tweets based on a specified category.
• The model efficiently separates positive, negative, and neutral tweets for a given topic.
What we learned
• How to integrate AWS technologies effectively
• The ease of working with the Django Framework
• How to train a model to effectively extract the required data
What's next for SmartFeed
We aim to scrape data from other social networking sites as well, such as Reddit. This will scale up our project and give us more data for better results.