What it does

A dashboard for machine-learning-based detection of cyberbullying activity on Twitter.

How we built it

Examined approaches to dialogue representation and classification in the context of cyberbullying detection:

  • Bag-of-words features: n-grams and skip-grams
  • word2vec + emoji2vec embeddings: implemented earlier work that extends GloVe to jointly learn robust emoji and word embeddings
  • Classification with SVMs using L1/L2 regularization
  • Best results achieved with an ensemble of logistic regression, SVM, and random forest classifiers
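
As an illustration of the bag-of-words representations listed above, here is a minimal sketch of n-gram and skip-gram feature extraction in plain Python. The function names, the whitespace tokenizer, and the `max_skip` parameter are assumptions made for this example and are not taken from the project's code.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return all contiguous n-grams as tuples, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skip_bigrams(tokens, max_skip):
    """Return ordered token pairs with at most `max_skip` tokens between them."""
    pairs = []
    for i in range(len(tokens)):
        # j ranges over the next (max_skip + 1) positions after i
        for j in range(i + 1, min(i + max_skip + 2, len(tokens))):
            pairs.append((tokens[i], tokens[j]))
    return pairs


def bag_of_features(text, max_n=2, max_skip=1):
    """Count unigrams..max_n-grams plus skip-bigrams (tagged with 'skip')."""
    tokens = tokenize(text)
    feats = Counter()
    for n in range(1, max_n + 1):
        feats.update(ngrams(tokens, n))
    feats.update(("skip",) + pair for pair in skip_bigrams(tokens, max_skip))
    return feats


if __name__ == "__main__":
    for feature, count in bag_of_features("You are so dumb").items():
        print(feature, count)
```

The resulting counts can be turned into a sparse vector per tweet and passed to any of the classifiers mentioned above.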

Promising results:

  • Classification accuracy: 78%
  • Precision: 0.91
  • Recall: 0.86
  • F1-score: 0.80
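
For reference, the sketch below shows how these four metrics are commonly computed for a binary classifier from confusion-matrix counts. The counts in the usage example are made up for illustration and are not the project's data; the write-up also does not state how its figures were averaged (per class, macro, or weighted), so this is only the standard single-class definition.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the "bullying" class. Zero denominators yield 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function
    print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```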

Built a proof-of-concept dashboard that presents a live overview of cyberbullying across Twitter, including:

  • Topic modeling
  • Emotion analysis with the NRC Word-Emotion Association Lexicon
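
A lexicon-based emotion overview of the kind a dashboard could display might look roughly like the sketch below. The `TOY_LEXICON` entries are invented placeholders, not real entries from the NRC lexicon, which has to be obtained separately; the function name and the punctuation handling are likewise assumptions for this example.

```python
from collections import Counter

# Placeholder word -> emotion mapping for illustration only.
# These are NOT actual NRC Word-Emotion Association Lexicon entries.
TOY_LEXICON = {
    "hate": {"anger", "disgust"},
    "stupid": {"anger"},
    "scared": {"fear"},
    "happy": {"joy"},
}


def emotion_profile(texts, lexicon):
    """Count how often each emotion is triggered across a list of texts."""
    counts = Counter()
    for text in texts:
        for token in text.lower().split():
            word = token.strip(".,!?\"'#@")  # drop surrounding punctuation
            counts.update(lexicon.get(word, ()))
    return counts


if __name__ == "__main__":
    tweets = ["I hate you!", "so happy today", "You are stupid."]
    for emotion, count in emotion_profile(tweets, TOY_LEXICON).most_common():
        print(emotion, count)
```

Aggregating these counts over a time window gives one possible per-emotion series to plot.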

Challenges we ran into

We had trouble designing the dashboard.

Accomplishments that we're proud of

What we learned

What's next for Cyber Bully Project

  • Improve model
  • Dialogue representation
  • Social network graph analysis
  • Improve dashboard