As well-informed citizens of the internet, we often end up browsing sites that feature controversial topics, and Reddit is one of them. Reddit is known for supporting freedom of expression, so it hosts a wide array of controversial posts and conflicting user comments. Consequently, it is frequently a hotspot for heated 'debates' and vulgar language.

Drawing inspiration from a simple Chrome extension tutorial, we wanted to build something to improve our own browsing experience: a way to keep browsing and stay informed without being exposed to, distracted by, or influenced by the often unproductive comments.

What it does

True to its name, Censsit uses JavaScript to check Reddit comments against an array of words deemed profane. It currently censors the entire comment by replacing it with a preset statement.
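As a rough illustration of this whole-comment replacement, here is a minimal sketch. The placeholder text, the sample words, and the `.comment-body` selector are all hypothetical; the project's actual values are not shown in this write-up.

```javascript
// Hypothetical placeholder; the project's real preset statement is not shown.
const PLACEHOLDER = "[comment censored]";

// Returns true if any listed word appears in the text
// (case-insensitive substring match).
function containsProfanity(text, words) {
  const lower = text.toLowerCase();
  return words.some((word) => lower.includes(word.toLowerCase()));
}

// Replaces the entire text of every matching element with the placeholder
// and returns how many elements were censored.
function censorComments(elements, words, placeholder = PLACEHOLDER) {
  let censored = 0;
  for (const el of elements) {
    if (containsProfanity(el.textContent, words)) {
      el.textContent = placeholder;
      censored += 1;
    }
  }
  return censored;
}

// In a content script this might be called as follows
// (the selector is an assumption, not Reddit's real markup):
// censorComments(document.querySelectorAll(".comment-body"), ["heck", "darn"]);
```

Because the check is a plain substring match, it can flag innocent words that happen to contain a listed word, which is one reason to move to token-based matching.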

In addition, it features sentiment analysis using the Google Cloud Natural Language API, which allows comments to be filtered dynamically based on whether their sentiment is positive or negative.
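The sketch below shows one way such a check could look against the API's `documents:analyzeSentiment` REST method. The threshold value, the function names, and the use of an API key in the query string are assumptions for illustration, not the project's actual code.

```javascript
// REST endpoint for the Cloud Natural Language analyzeSentiment method.
const ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment";

// Builds the JSON body the API expects for a plain-text document.
function buildSentimentRequest(text) {
  return {
    document: { type: "PLAIN_TEXT", content: text },
    encodingType: "UTF8",
  };
}

// Decides whether a response counts as negative. The API reports a
// document score between -1.0 and 1.0; the cutoff here is a hypothetical choice.
function isNegative(response, threshold = -0.25) {
  const score = response?.documentSentiment?.score;
  return typeof score === "number" && score <= threshold;
}

// Sends one comment to the API and resolves with the parsed response.
// apiKey is assumed to be a valid Google Cloud API key.
async function analyzeSentiment(text, apiKey) {
  const res = await fetch(`${ENDPOINT}?key=${encodeURIComponent(apiKey)}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSentimentRequest(text)),
  });
  if (!res.ok) {
    throw new Error(`Sentiment request failed with status ${res.status}`);
  }
  return res.json();
}
```

Keeping the request builder and the threshold check separate from the network call makes the filtering rule easy to adjust and to test without calling the API.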

Finally, we included a joke feature that ironically turns all uncensored comments into a profanity of the user's choice, a throwback to the kind of prank an older sibling might play on a younger one.

How we built it

We started from a Chrome extension tutorial for a bare-minimum template, then improved on its functionality, tackling issues one by one until the extension could do its job. Starting with a manifest.json file connected to an HTML file and a CSS file, we first experimented with parsing Reddit comments, using CSS overlays to confirm visually that our class ID checks were working correctly. That let us black out all comments based on class IDs. Next, we implemented specific vulgar-word checks so that the extension censors and replaces only comments that contain vulgarity, rather than ALL comments. The final implementation splits each comment with a regex so that the filter also catches profane words that are deliberately varied with common punctuation marks such as the period (.), comma (,), and question mark (?), as well as whitespace.
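The regex-splitting step described above can be sketched roughly as follows. The function names and the sample word are hypothetical, and the exact regex the project used is not shown here; this version simply splits on the separators named above.

```javascript
// Splits a comment into lowercase tokens on periods, commas,
// question marks, and whitespace, dropping empty pieces.
function tokenize(comment) {
  return comment
    .toLowerCase()
    .split(/[.,?\s]+/)
    .filter(Boolean);
}

// Returns true if any whole token matches a word in the list.
function hasProfaneToken(comment, words) {
  const wordSet = new Set(words.map((word) => word.toLowerCase()));
  return tokenize(comment).some((token) => wordSet.has(token));
}
```

For example, `hasProfaneToken("Oh heck, no?", ["heck"])` matches because the trailing comma is stripped from the token, while `hasProfaneToken("check", ["heck"])` does not, since only whole tokens are compared.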

Challenges we ran into

  • Understanding how to even build a Chrome extension
  • Understanding how to use jQuery
  • Parsing Reddit comments from HTML within their own respective classes
  • Reading a text file of profane words into an array for efficiency
  • Preventing the comment box from being censored, because Reddit identifies it with "class ...", which contains a certain expletive

Accomplishments that we're proud of

  • Successfully parsing comments and injecting CSS into the webpage
  • Creating an easy way to have a joke option as a relic of our progress in the project
  • Successfully implementing text replacement for profane comments
  • Successfully whitelisting the comment box from being filtered
  • Successfully splitting comments to check for profane variants
  • Successfully integrating the Google Cloud Natural Language API for sentiment analysis
  • Having a working product in the end

What we learned

  • Chrome extensions are not as easy as they're cracked up to be
  • There is no built-in function to read and parse a file into an array
  • On the same note, our attempts to parse the profane-word file with Node.js require(), the FileReader API, and sketch.js were all unsuccessful
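For the file-parsing problem in the list above, one approach that can work inside an extension is to fetch the bundled text file and split it manually. This is a sketch under assumptions: the file name `words.txt` is hypothetical, the list is assumed to hold one word per line, and for a content script the file would likely need to be listed under `web_accessible_resources` in manifest.json.

```javascript
// Turns the raw text of a word list (assumed one entry per line)
// into a clean array of lowercase words, skipping blank lines.
function parseWordList(fileText) {
  return fileText
    .split(/\r?\n/)
    .map((line) => line.trim().toLowerCase())
    .filter((line) => line.length > 0);
}

// Loads a file bundled with the extension and parses it.
// "words.txt" is a hypothetical file name.
async function loadWordList(path = "words.txt") {
  const url = chrome.runtime.getURL(path);
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`Could not load ${path}: status ${res.status}`);
  }
  return parseWordList(await res.text());
}
```

Loading the list once and reusing the resulting array avoids re-reading the file for every comment.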

What's next for Censsit

We plan to expand Censsit with machine learning so that it can dynamically censor a wider variety of vulgar language. As we know, language evolves quickly, and people adapt to avoid language filters.
