Inspiration
We’re all avid Reddit users, and we’ve seen how much Reddit as a platform can encourage people to act offline
Users attacking other users is an online social problem that is still ongoing today
Trolling, bullying, harassment, and demeaning behavior online cause demonstrable harm
What do we want to achieve?
Create an application that measures how toxic/harmful a redditor is towards others by analyzing Reddit comments and subreddit posts
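As a rough sketch of what such a measure could look like, assume each comment has already been given a toxicity score between 0 and 1 by some classifier (that scoring step is not shown here). The names `Comment` and `user_toxicity`, the summary fields, and the 0.5 threshold are illustrative assumptions, not the project's actual code:

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Comment:
    author: str
    body: str
    toxicity: float  # assumed: classifier output in [0, 1]


def user_toxicity(comments, threshold=0.5):
    """Summarize per-comment scores into one record per author."""
    by_author = defaultdict(list)
    for c in comments:
        by_author[c.author].append(c.toxicity)
    return {
        author: {
            "comments": len(scores),
            "mean_toxicity": mean(scores),
            # Fraction of the author's comments at or above the threshold.
            "toxic_share": sum(s >= threshold for s in scores) / len(scores),
        }
        for author, scores in by_author.items()
    }
```

Reporting both a mean and a share is one possible design choice: the mean reflects overall tone, while the share shows how often a user crosses the line.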
Tech Stack
- React.js - Display data
- CockroachDB
- Node.js, Express - Store data and bring to front end in a scalable way
- PushShift API - Extracts subreddits, submissions, comments, etc. from Reddit
- Python packages such as TensorFlow, RoBERTa
- Visualization tools such as matplotlib
Challenges we ran into
- We had difficulty configuring some of the libraries (e.g., Python package dependencies)
- Passing data from Python to JS was a unique challenge
- Learning CockroachDB
- Natural Language Processing is complicated & there are many different options, so it is difficult to find the optimal choices
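One common way to hand results from Python to JS is to serialize them as JSON, which both sides can read. The sketch below is a minimal example of the Python side only; `to_json_payload`, the `results`/`user`/`score` field names, and the sample score are hypothetical and not taken from the project:

```python
import json
import sys


def to_json_payload(user_scores):
    """Serialize per-user results into a JSON string a Node.js process can parse."""
    rows = [
        # float() also converts NumPy scalars such as float32, which the
        # json module cannot serialize directly.
        {"user": user, "score": round(float(score), 4)}
        for user, score in sorted(user_scores.items())
    ]
    return json.dumps({"results": rows})


if __name__ == "__main__":
    # Hypothetical scores; a real run would take them from the analysis step.
    sys.stdout.write(to_json_payload({"example_user": 0.42}))
```

On the Node.js side, the script could be started with `child_process.spawn` and its stdout passed to `JSON.parse`; writing the JSON to a file or a database table would work similarly.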
Accomplishments that we're proud of
- We got some results, a working webpage/frontend/backend, etc.
- First time working with this type of raw big data
- Using and combining a lot of new technologies, like CockroachDB, Reddit API/PushShift, and emotion/semantic analysis
What we learned
- We learned how to use interesting new tools!
What's next for Said It On Reddit
- Add additional subreddits to get a wide variety of data
- Automate full data pipeline and update data as it changes
- Compare most reactive or popular users from the posts and comments
- Further data analysis & creative data visualizations
Built With
- cockroachdb
- express.js
- node.js
- python
- react