Inspiration

It's no secret that the internet is far from a safe space. Harmful language - whether used "jokingly" or maliciously - can be triggering for people of all identities. And, unfortunately, many websites and social media platforms do not give users the ability to censor content for themselves. This is where strikethru comes in.

What it does

strikethru is a Chrome extension that scrapes a page's HTML to find harmful words and hide them from the user. Think of it like a content warning maker for the internet! The user can pick from different categories of potential trigger words and even add their own.

strikethru's website also has a file upload tool that filters an uploaded .txt file and outputs the same document with trigger words replaced by asterisks. Users can also opt for sentence-level filtering, which hides entire sentences that are recognized as hate speech.
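As a rough sketch of the word-level masking (the word list and helper name here are placeholders, not our exact code), the filter only needs to match each token against a set and swap matches for asterisks:

```python
import re

# Placeholder set; the real lists are organized by user-selected category.
TRIGGER_WORDS = {"slur1", "slur2"}

def mask_words(text):
    """Replace each trigger word with asterisks of the same length."""
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if word.lower() in TRIGGER_WORDS else word
    return re.sub(r"\b\w+\b", mask, text)

# Filtering an uploaded .txt file:
with open("upload.txt") as f:
    print(mask_words(f.read()))
```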

How we built it

The front-end is a responsive web app built with React/Redux and HTML/SCSS/JavaScript. It uses Google OAuth to sign users in, save their preferences, and connect with the Chrome extension.

The back-end API was built with two core functions in mind: user storage and text processing. For the former, we used Google's Cloud Firestore database to store collections of registered users, with basic CRUD functionality for handling users and their stored preferences. For text processing, we offer users multiple options for analysis, and those preferences are saved in the Firestore user collection. Users can process their text either word by word or sentence by sentence, each method suited to a different goal. The word method compiles a library of known slurs and trigger words and scans text for outright matches. The sentence method uses the Python hatesonar library, an NLP hate-speech classifier, to make a prediction for each sentence.
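A minimal sketch of the sentence method, assuming hatesonar's published Sonar.ping interface (the threshold and function name are our own illustration, not the project's exact code):

```python
from hatesonar import Sonar

sonar = Sonar()  # pretrained hate-speech / offensive-language classifier

def flag_sentences(sentences, threshold=0.5):
    """Return the sentences whose hate_speech confidence exceeds threshold."""
    flagged = []
    for sentence in sentences:
        result = sonar.ping(text=sentence)
        # ping() reports a top_class plus a confidence score for each class
        scores = {c["class_name"]: c["confidence"] for c in result["classes"]}
        if scores.get("hate_speech", 0.0) >= threshold:
            flagged.append(sentence)
    return flagged
```

Sentences that clear the threshold are the ones the API tells the client to hide.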

Challenges we ran into

We really wanted to implement our own model for hate-speech classification, but we found that in the time given we could not reach accuracy much better than random. We used data from Kaggle and trained two different Naive Bayes models: one with GaussianNB from Python's sklearn and one with the Naive Bayes classifier from Python's textblob. In both cases we ran into issues with data quality and predictive power, so we chose to use Python's hatesonar library instead. While that model has shortcomings of its own, we found it was the best alternative for a project built in so little time.
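For reference, the two baselines we tried look roughly like this (the toy texts and labels below stand in for the Kaggle data, and this is a sketch rather than our exact training code):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import GaussianNB
from textblob.classifiers import NaiveBayesClassifier

# Toy stand-ins for the Kaggle dataset we trained on
texts = ["you people are awful", "what a lovely day"]
labels = ["hate", "ok"]

# sklearn baseline: GaussianNB needs a dense feature matrix,
# so the sparse bag-of-words output is converted with .toarray()
X = CountVectorizer().fit_transform(texts).toarray()
gnb = GaussianNB().fit(X, labels)

# textblob baseline: takes (text, label) pairs directly
cl = NaiveBayesClassifier(list(zip(texts, labels)))
print(cl.classify("have a lovely day"))
```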

Accomplishments that we're proud of

We're proud of our idea that aims to make the internet a safer place for everyone. We also like our design and the practicality of our Chrome extension!

What we learned

We learned a lot about the prevalence of hate speech on the internet over the course of this project, as well as the ways different groups are targeted by this harassment. Technically, we learned about classification methods, NLP analysis, how to scrape and edit a page's HTML to produce the word blurs, and a new type of design: animated gradients!

What's next for strikethru

One next step would be to create a customized model for predicting our own criteria of 'hate speech'. Two of our members are in the first few weeks of COSC 72: Advanced Computational Linguistics, and we hope by the end of the term to build our own model with our own data and curated insights. We also plan to increase the level of customization and analytics for users, perhaps allowing more interactivity with the data generated from their text analysis and widening our database of trigger words to offer more word sets.
