Inspiration

Reddit has grown in popularity over the years, and many of the platform's most popular subreddits are plagued with toxic comments and posts, harming the user experience. We built a web application that informs users about the toxicity of a subreddit and provides data for comparing subreddits against each other.

What it does

RedditSafe estimates the toxicity of a given subreddit's posts and comments using Cohere's large language models, applied to content fetched through PRAW, the Python wrapper for Reddit's API. Toxicity scores and metadata are stored in a SQL Server database for comparison across subreddits.
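A minimal sketch of the scoring step, assuming Cohere's classify endpoint with labelled examples. The example texts below are invented stand-ins for entries from Surge AI's dataset, and `classify_comments` needs the `cohere` SDK plus an API key; the aggregation helper is pure Python.

```python
def classify_comments(texts, api_key):
    """Return P(Toxic) for each comment via Cohere's classify endpoint
    (assumed response shape; check the SDK version you use)."""
    import cohere  # imported lazily so the helper below works without the SDK

    examples = [
        # Invented placeholders for labelled dataset entries.
        cohere.ClassifyExample(text="nobody cares, go away", label="Toxic"),
        cohere.ClassifyExample(text="great write-up, thanks!", label="Not Toxic"),
    ]
    response = cohere.Client(api_key).classify(inputs=texts, examples=examples)
    return [c.labels["Toxic"].confidence for c in response.classifications]

def toxicity_score(confidences):
    """Aggregate per-comment P(Toxic) values into one 0-100 subreddit score."""
    if not confidences:
        return 0.0
    return round(100 * sum(confidences) / len(confidences), 1)
```

For example, `toxicity_score([0.5, 0.5])` yields a subreddit score of 50.0.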

How we built it

The web app uses Flask on the backend and Solid on the frontend. Toxicity metrics are computed with Cohere's API for text classification, using labelled examples from Surge AI's "Social Media Toxicity Dataset". Reddit posts and comments are scraped with PRAW, and the data is stored in a SQL Server database hosted on Microsoft Azure.

Challenges we ran into

Formatting and handling Reddit API responses, and connecting the front end to the back end.
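The front-end/back-end connection amounted to the Solid app fetching JSON from Flask routes. A minimal sketch, where the route, field names, and the `load_scores` stub are illustrative rather than the project's actual API:

```python
def payload(subreddit, scores):
    """Shape the JSON body returned to the front end."""
    avg = round(sum(scores) / len(scores), 3) if scores else None
    return {"subreddit": subreddit, "samples": len(scores), "avg_toxicity": avg}

def load_scores(subreddit):
    """Stand-in for the SQL Server lookup of stored toxicity scores."""
    return []

def create_app():
    from flask import Flask, jsonify  # lazy import; sketch only

    app = Flask(__name__)

    @app.get("/api/toxicity/<subreddit>")
    def toxicity(subreddit):
        # The Solid front end fetches this URL and renders the JSON.
        return jsonify(payload(subreddit, load_scores(subreddit)))

    return app
```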

What's next for RedditSafe

Adding more subreddit analysis data, such as misinformation likelihood, to allow for a more in-depth understanding.

Built With

azure, cohere, flask, praw, solid, sql-server