Anti-Hate Speech Google Chrome Plugin

What does it do?

This Google Chrome extension hides Tweets that are classified as hate speech and censors vulgar words in the Tweets that remain visible.

What is our inspiration?

A certain high-ranking official has recently been using Twitter as a platform to spread controversial news and hate. Fortunately, this person's account is now deactivated. However, that is just a single individual. There are still hundreds of thousands of Tweets that spread the same hatred. This plugin aims to prevent those hateful Tweets from reaching the user.

How did we build it?

We first built an NLP model in Python using spaCy and sklearn and trained it on this dataset, which contains 20,000+ Tweets, each labeled as hate speech, offensive language, or neither. Then, we created a REST API using Flask so that the Google Chrome extension can communicate with our NLP model. Lastly, we wrote the Google Chrome extension in JavaScript. It parses the Twitter webpage for text-only versions of the visible Tweets and sends them to our NLP model to be classified. If a Tweet is classified as hate speech, it is hidden from the user. Additionally, the extension ships with a list of vulgar words; if a Tweet contains one of them, that word is censored.
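The hide-or-censor decision described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's code: the real extension is written in JavaScript, and the label string "hate_speech", the placeholder word list, and the names censor and filter_tweet are all hypothetical.

```python
import re

# Placeholder word list (assumption); the project's actual list is not shown here.
VULGAR_WORDS = {"darn", "heck"}


def censor(text, words=VULGAR_WORDS):
    """Replace every whole-word match from `words` with asterisks of equal length."""
    if not words:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in sorted(words)) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "*" * len(m.group(0)), text)


def filter_tweet(text, label):
    """Return None when the Tweet should be hidden, else the censored text.

    `label` stands in for the classifier's output; the exact label strings
    are an assumption made for this sketch.
    """
    if label == "hate_speech":
        return None
    return censor(text)
```

Matching on word boundaries keeps harmless words that merely contain a listed word (for example "heckle") untouched.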

What are some challenges we ran into?

Both the Tweet classification dataset and the offensive language word list contained unusual data that had to be cleaned up before use. For example, emojis were not encoded properly in the dataset, so we had to remove them during preprocessing.
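A cleanup step of this kind might look like the sketch below. It assumes the broken emojis appear as numeric HTML entities such as "&#128514;" or as stray non-ASCII characters; the actual form in the dataset may differ, and the function name clean_tweet is hypothetical.

```python
import html
import re

# Numeric HTML entities, e.g. "&#128514;" (assumed form of the mis-encoded emojis).
ENTITY_RE = re.compile(r"&#\d+;")


def clean_tweet(text):
    """Strip emoji residue and normalize whitespace in a raw Tweet string."""
    text = ENTITY_RE.sub(" ", text)              # drop numeric entities
    text = html.unescape(text)                   # decode named entities such as &amp;
    text = text.encode("ascii", "ignore").decode("ascii")  # drop remaining non-ASCII
    return re.sub(r"\s+", " ", text).strip()     # collapse runs of whitespace
```

Dropping every non-ASCII character is a blunt choice: it also removes accented letters, which may or may not be acceptable depending on the data.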

What are some accomplishments we're proud of?

We were able to fully implement the project that we had initially envisioned. Also, in testing, our NLP model attained 89% accuracy on Tweets it had not seen during training. We are happy with this result considering we did not have time to properly fine-tune the model.
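For reference, accuracy here means the fraction of held-out Tweets whose predicted label matches the true label. A minimal sketch of that calculation follows; the function name and the example labels are made up for illustration.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up example: 3 of 4 predictions match, so the result is 0.75.
example = accuracy(
    ["neither", "hate_speech", "neither", "offensive_language"],
    ["neither", "hate_speech", "offensive_language", "offensive_language"],
)
```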

What's next for the Anti-Hate Speech Plugin?

The next step would be extending this plugin to other social media websites such as Facebook and Instagram. Collecting more data points for a more accurate model and fine-tuning the model in general would also be highly beneficial.