Inspiration

  • We’re all avid Reddit users, and we’ve seen how strongly Reddit as a platform can encourage people to act offline

  • Users attacking other users is an ongoing online social problem

  • Trolling, bullying, harassment, and demeaning behavior online cause demonstrable harm

  • What do we want to achieve?

  • Create an application that measures how toxic or harmful a redditor is toward others by analyzing their Reddit comments and subreddit posts
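
One way such a measure could be aggregated is sketched below. It assumes each comment has already been given a toxicity score between 0 and 1 by some classifier. The function name `user_toxicity`, the 0.5 threshold, and the summary fields are illustrative assumptions, not the project's actual method.

```python
from statistics import mean


def user_toxicity(scores, threshold=0.5):
    """Summarize per-comment toxicity scores (each in [0, 1]) for one user.

    Hypothetical helper: the threshold and the returned fields are
    assumptions made for illustration only.
    """
    if not scores:
        # No comments means nothing to measure.
        return {"mean": 0.0, "toxic_share": 0.0, "n_comments": 0}
    # Comments at or above the threshold count as toxic.
    toxic = [s for s in scores if s >= threshold]
    return {
        "mean": mean(scores),
        "toxic_share": len(toxic) / len(scores),
        "n_comments": len(scores),
    }
```

Reporting both the mean and the share of comments above a threshold helps separate a user who is mildly negative everywhere from one who is occasionally very hostile.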

Tech Stack

  • React.js - Display data
  • CockroachDB
  • Node.js, Express - Store data and serve it to the front end in a scalable way
  • PushShift API - Extracts subreddits, submissions, comments, etc. from Reddit
  • Python packages such as TensorFlow, and the RoBERTa model
  • Visualization tools such as matplotlib
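
As a rough illustration of the data-extraction step, the sketch below only builds a query URL for comments in a subreddit and makes no network call. The endpoint URL and the `subreddit`, `author`, and `size` parameters are assumptions based on how the PushShift comment search has commonly been queried, and should be checked against the current API before use. The function name `build_comment_query` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the PushShift comment search endpoint; verify before use.
PUSHSHIFT_COMMENT_URL = "https://api.pushshift.io/reddit/search/comment/"


def build_comment_query(subreddit, author=None, size=100):
    """Return a query URL for comments in a subreddit (hypothetical helper)."""
    params = {"subreddit": subreddit, "size": size}
    if author is not None:
        # Optionally narrow the search to a single redditor.
        params["author"] = author
    return f"{PUSHSHIFT_COMMENT_URL}?{urlencode(params)}"
```

The resulting URL could then be fetched with any HTTP client and the JSON response passed on to the analysis step.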

Challenges we ran into

  • We had difficulty configuring some of the libraries (e.g., Python package dependencies)
  • Passing data from Python to JS was a unique challenge
  • Learning CockroachDB
  • Natural language processing is complicated and there are many different options, so it was difficult to find the optimal choices
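
For the Python-to-JS handoff mentioned above, one common option is to have the Python side emit JSON, which the Node.js side can parse. The sketch below is an assumption about how this could be done, not a description of what the project actually did. The function name `to_json_payload` and the payload fields are hypothetical.

```python
import json


def to_json_payload(username, scores):
    """Serialize one user's scores as a JSON string (hypothetical helper)."""
    payload = {
        "username": username,
        # Cast to plain floats so NumPy or TensorFlow scalars also serialize.
        "scores": [round(float(s), 4) for s in scores],
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Printing to stdout lets another process read the result.
    print(to_json_payload("example_user", [0.12, 0.87]))
```

On the Node.js side, the script's standard output could be read through `child_process` and passed to `JSON.parse`. Writing the results to the database and reading them from Express is another option.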

Accomplishments that we're proud of

  • We got results and a working web page with both a front end and a back end
  • First time working with this type of raw big data
  • Using and combining a lot of new technologies, like CockroachDB, Reddit API/PushShift, and emotion/semantic analysis

What we learned

  • We learned how to use interesting new tools!

What's next for Said It On Reddit

  • Add additional subreddits to get a wide variety of data
  • Automate the full data pipeline and update the data as it changes
  • Compare the most reactive or popular users across posts and comments
  • Further data analysis & creative data visualizations
