Inspiration
Reddit has grown in popularity over the years. Many of the platform's most popular subreddits are plagued with toxic comments and posts, harming the experience for many users. We built a web application that informs users about the toxicity of a subreddit and provides data for comparing subreddits.
What it does
RedditSafe estimates the toxicity of posts and comments in a given subreddit by applying Cohere's large language models to content retrieved through PRAW, the Python wrapper for Reddit's API. Toxicity scores and metadata are stored in a SQL Server database so that different subreddits can be compared.
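As a rough illustration of how per-item classifications might be rolled up into a single score for a subreddit, here is a minimal sketch. The `Classification` shape, the label names "toxic" and "not toxic", the confidence threshold, and the `toxicity_rate` function are all assumptions made for this example; they are not taken from the project's code or from Cohere's response types.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Classification:
    """One classified post or comment (hypothetical shape for this sketch)."""
    text: str
    label: str         # assumed label names: "toxic" or "not toxic"
    confidence: float  # classifier confidence in the label, between 0 and 1


def toxicity_rate(results: Iterable[Classification], min_confidence: float = 0.5) -> float:
    """Return the share of items labeled toxic with at least min_confidence."""
    results = list(results)
    if not results:
        # No data for this subreddit: report 0.0 rather than dividing by zero.
        return 0.0
    toxic = sum(
        1 for r in results
        if r.label == "toxic" and r.confidence >= min_confidence
    )
    return toxic / len(results)


# Made-up sample data, only to show the call.
sample = [
    Classification("You are all idiots", "toxic", 0.97),
    Classification("Thanks, this helped a lot", "not toxic", 0.97),
    Classification("Interesting point", "not toxic", 0.88),
    Classification("Nobody wants you here", "toxic", 0.91),
]
print(f"{toxicity_rate(sample):.2f}")  # prints 0.50
```

A simple rate like this keeps scores comparable across subreddits of different sizes, although a real deployment may prefer to weight items by confidence or by votes.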
How we built it
The web app uses Flask on the backend and Solid on the frontend. Toxicity metrics are computed with Cohere's API, which performs text classification using examples from Surge AI's "Social Media Toxicity Dataset". Reddit posts and comments are scraped with PRAW. Data is stored in a SQL Server database hosted on Microsoft Azure.
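The scraping and storage steps could look roughly like the sketch below. It assumes `reddit` is a `praw.Reddit` instance created with your own credentials, and it uses Python's built-in `sqlite3` as a local stand-in for the SQL Server database on Azure. The function names, the table name `subreddit_scores`, and its columns are hypothetical and are not the project's actual schema.

```python
import sqlite3
from typing import List


def collect_texts(reddit, subreddit_name: str, post_limit: int = 10) -> List[str]:
    """Collect post and comment texts from the hot listing of a subreddit.

    `reddit` is assumed to be a praw.Reddit instance, for example:
        reddit = praw.Reddit(client_id="...", client_secret="...", user_agent="...")
    """
    texts = []
    for submission in reddit.subreddit(subreddit_name).hot(limit=post_limit):
        # Combine title and body; link posts have an empty selftext.
        texts.append(f"{submission.title}\n{submission.selftext}".strip())
        # Drop "load more comments" placeholders so only real comments remain.
        submission.comments.replace_more(limit=0)
        texts.extend(comment.body for comment in submission.comments.list())
    return texts


def save_score(conn: sqlite3.Connection, subreddit_name: str,
               toxicity: float, item_count: int) -> None:
    """Insert or update the stored score for one subreddit (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS subreddit_scores ("
        "subreddit TEXT PRIMARY KEY, "
        "toxicity REAL NOT NULL, "
        "item_count INTEGER NOT NULL)"
    )
    # Parameterized query: subreddit names come from user input.
    conn.execute(
        "INSERT OR REPLACE INTO subreddit_scores (subreddit, toxicity, item_count) "
        "VALUES (?, ?, ?)",
        (subreddit_name, toxicity, item_count),
    )
    conn.commit()
```

Passing the Reddit client and the database connection in as arguments keeps both functions easy to test with fake objects, and swapping `sqlite3` for a SQL Server driver would mainly change how the connection is created and the upsert statement.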
Challenges we ran into
Formatting and handling Reddit API responses. Connecting the frontend to the backend.
What's next for RedditSafe
Adding more subreddit analysis data, such as misinformation rate or likelihood, to give a more in-depth understanding of each subreddit.