Cyberbullying Detector

Inspiration

We came across a Kaggle dataset of labeled cyberbullying tweets. The topic is increasingly relevant as more people, especially children, interact online. Our project is an end-to-end solution that helps parents, educators, and other online moderators identify cyberbullying speech.

What it does

A Chrome extension scans a webpage and highlights phrases that may be cyberbullying. It classifies these phrases using a pretrained model running on a Python server.
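To make the scan-and-highlight step concrete, here is a minimal Python sketch of how page text could be split into phrases and mapped to character ranges to highlight. It is an illustration and not the project's actual code: `toy_classifier`, `FLAGGED_WORDS`, and `find_spans_to_highlight` are hypothetical names, and the keyword check only stands in for the trained model.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical stand-in for the trained model (assumption): flags a phrase
# if it contains any word from a tiny illustrative list.
FLAGGED_WORDS = {"stupid", "loser", "ugly"}


def toy_classifier(phrase: str) -> bool:
    words = re.findall(r"[a-z']+", phrase.lower())
    return any(word in FLAGGED_WORDS for word in words)


def find_spans_to_highlight(
    text: str, is_bullying: Callable[[str], bool]
) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of phrases the classifier flags."""
    spans = []
    # Treat each run of text ending in '.', '!' or '?' (or end of input) as one phrase.
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        phrase = raw.strip()
        if phrase and is_bullying(phrase):
            # Skip leading whitespace so only the phrase itself is highlighted.
            start = match.start() + (len(raw) - len(raw.lstrip()))
            spans.append((start, start + len(phrase)))
    return spans


page_text = "Nice game today! You are such a loser. See you tomorrow."
for start, end in find_spans_to_highlight(page_text, toy_classifier):
    print(page_text[start:end])
```

Returning offsets rather than modified text keeps the classification logic separate from how the extension chooses to render highlights.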

How we built it

We built the Chrome extension in plain JavaScript and used Python with FastAPI to host the models on the backend. We trained the NLP model by building on existing notebooks for the dataset. The Chrome extension currently uses a random forest model.
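As a rough sketch of what a random forest text classifier can look like in scikit-learn, the example below chains a TF-IDF vectorizer with a `RandomForestClassifier`. The TF-IDF features, the hyperparameters, the `classify_phrases` helper, and the tiny made-up training set are all assumptions for illustration; the project's real features and training data may differ.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy, made-up examples for illustration only (1 = cyberbullying, 0 = not).
# The real model was trained on the Kaggle dataset of labeled tweets.
texts = [
    "you are such a loser",
    "nobody likes you, just quit",
    "you are so stupid and ugly",
    "great job on the project",
    "see you at practice tomorrow",
    "thanks for helping me study",
]
labels = [1, 1, 1, 0, 0, 0]

# Assumption: TF-IDF over unigrams and bigrams feeds the random forest.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(texts, labels)


def classify_phrases(phrases):
    """Return one 0/1 label per phrase, in input order."""
    return [int(label) for label in model.predict(phrases)]


print(classify_phrases(["you are a loser", "see you at practice"]))
```

A function like `classify_phrases` is the kind of call a FastAPI route could wrap, so the extension sends a list of phrases and receives one label per phrase.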

Challenges we ran into

Since this was our first time building an NLP model, there was a significant learning curve. We learned about the different approaches to text classification. We attempted to build a model based on BERT but have been unsuccessful as of the time of writing. We also spent significant time trying to get an example text classifier running with NeMo, but got stuck on some errors.

Accomplishments that we're proud of

This was our first time building a full NLP model on a larger dataset and creating a Chrome extension.

What we learned

How to train a text classifier and create a Chrome extension.

What's next for Cyberbullying Detector

In the future, this can be used to moderate online forums, game chats, and social media. Educators can train models specific to an age group for even more accurate classification. The Chrome extension can be optimized for specific websites, such as Instagram. Concerned parents could use it on their children's text chats. For educators, this functionality could become a Zoom extension that moderates Zoom chat, or an add-on for other education software.