Wouldn’t it be great if we could teach people how to be nice to each other on social media platforms, and then reward them for changing their behavior?
Bullying is a problem facing one in three teenagers, according to recent research from the US . The most common types of online harassment include “mean” comments, the spreading of online rumours and racial, disability and sexual remarks.
So we know what the problem is, what can we do about it?
What it does
Coruscant aims to tackle cyberbullying by providing an opt-in service for users which highlights any offensive language in their social media posts. Coruscant applies Machine Learning algorithms to rate the text in the user’s posts, issuing Offensiveness Scores across a number of categories:
- Identity Hate
Users who either reduce their Offensiveness Scores, or keep their Offensiveness Scores low over time, are rewarded with a badge which can be displayed on the Facebook profile.
By issuing award badges which the user can post on their timeline, Coruscant aims to improve social behaviour and encourage lasting change, making Facebook a safer, better environment for everyone.
How we built it
In order to demonstrate the concept, we created a Node.js web page and Python Flask app which simulates how Coruscant would receive Facebook posts for processing.
We then implemented the BERT Machine Learning model  using PyTorch. This model is what applies the classifications and returns the offensiveness scores used to educate users.
Once the data is returned to the Node.js web page, we calculate and then display the Offensiveness Scores as a percentage of how certain the model is that the post’s language is offensive.
Once complete, We trained the model using codified cyberbullying datasets which allowed us to identify the accuracy of the model.
Challenges we ran into
Finding a large enough dataset of codified offensive posts proved a challenge. We used a dataset which contained questions and then transformed them into statements which provided us with 110.000 sample statements.
Accomplishments that we're proud of
The model provides Offensiveness Scores on several classifications which allow the user to easily identify why a particular post was flagged as offensive.
The gamification concept provides positive reinforcement for the user, encouraging lasting change as they seek to extend their “winning streak”.
What we learned
It is possible to identify different types and levels of cyberbullying and extremist comments in social media using Machine Learning. We can then apply this to social media posts to help them identify their behavior. By rewarding positive behavior, we aim to encourage lasting change.
What's next for Coruscant
One of our first priorities would be to integrate the solution as part of Facebook.
We wish to extend the offensiveness categorisation to include sub-categories (such as racism and homophobia) to make it easier for users to understand why a post is offensive.
We plan to create another Machine Learning model which will generate suggestions for rewriting posts to reduce the offensive score.
We would then look to create a near real-time “offensive checker”, which will highlight offensive comments as they are written in a similar manner to how spell checkers work.