All of our members have used the internet for a long time and have encountered various forms of hate speech, much of it targeted at Asians, Pacific Islanders, and Asian Americans. Drawing on the prompts given and our personal experiences, we thought we could help combat anti-Asian hate speech through a crowdsourced machine learning model similar to Google reCAPTCHA.
What it does
Our project uses a Python server running an NLP machine learning model to detect hate speech. An SQLite database stores the data used to train the model, and we trained the model initially on an existing dataset. We then developed a Google Chrome extension that lets users submit hate speech they come across on the internet by highlighting the text, right-clicking, and selecting the report feature. The extension connects to the Python server, which adds the input to our database. Currently, the model retrains after every 10 new inputs. Finally, we developed a web scraper and a website to display how well the model detects hate speech.
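The report-and-retrain loop described above could be sketched roughly as follows. This is a minimal illustration using only Python's built-in sqlite3 module, not our actual server code: the `reports` table, its columns, and the `init_db` and `add_report` helpers are hypothetical names, and `retrain` stands in for whatever function rebuilds the NLP model.

```python
import sqlite3

RETRAIN_EVERY = 10  # from the write-up: retrain after every 10 new inputs


def init_db(conn):
    """Create the (hypothetical) table that holds reported text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reports ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "text TEXT NOT NULL, "
        "used_in_training INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def add_report(conn, text, retrain):
    """Store one reported snippet and retrain once enough are pending.

    `retrain` is a placeholder callable that receives every stored text.
    Returns True if a retrain was triggered by this report.
    """
    conn.execute("INSERT INTO reports (text) VALUES (?)", (text,))
    conn.commit()

    # Count reports that have not yet been part of a training run.
    pending = conn.execute(
        "SELECT COUNT(*) FROM reports WHERE used_in_training = 0"
    ).fetchone()[0]
    if pending < RETRAIN_EVERY:
        return False

    # Retrain on everything stored so far, then mark it all as used.
    rows = conn.execute("SELECT text FROM reports ORDER BY id").fetchall()
    retrain([row[0] for row in rows])
    conn.execute("UPDATE reports SET used_in_training = 1")
    conn.commit()
    return True
```

In this sketch the server would call `add_report` once per submission from the Chrome extension, so the retraining threshold is tracked in the database itself rather than in server memory and survives a restart.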
How we built it
Challenges we ran into
Accomplishments that we're proud of
The accomplishments we are proud of include getting the NLP model to work and getting it to read data from the web scraper, assign a hate score to it, and push the result to the website. We also got the server up and running, updating the database from inputs submitted through a working Chrome extension.
What we learned
What's next for "Stop the Hate"
Next steps include cleaning up the website, handling data poisoning, adding a route to manually clean the database, multithreading long NLP operations to prevent stalling, and building a bot that automatically reports hate speech on social media websites.
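As one possible direction for the multithreading item, the long NLP step could be handed to a background worker so that the server can answer the request immediately. The sketch below uses Python's standard `concurrent.futures` module; `slow_retrain` and `handle_report` are hypothetical placeholders, and the sleep merely simulates a long-running operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A single worker means at most one retrain runs at a time; extra jobs queue up.
_executor = ThreadPoolExecutor(max_workers=1)


def slow_retrain(texts):
    """Hypothetical stand-in for the long NLP retraining step."""
    time.sleep(0.2)  # simulate a slow operation
    return len(texts)


def handle_report(texts):
    """Schedule retraining in the background and return right away.

    The returned Future can be checked later for completion or errors.
    """
    return _executor.submit(slow_retrain, list(texts))
```

Threads keep the server responsive while a retrain is waiting on I/O or on library code that releases Python's global interpreter lock. If retraining turns out to be CPU-bound pure Python, a process pool may be a better fit.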