Soccer teams make management decisions, such as firing managers, signing players, and raising or lowering ticket prices, partly in response to fan satisfaction. These decisions are not always tied to wins and losses, and may in fact correlate more strongly with fan sentiment. Reddit aggregates and organizes fan ideas and opinions so that the most widely shared ones rise to the top. Using the Reddit API through PRAW, we extract this data from the subreddits of the top 6 teams in the English Premier League and analyze their post-match threads over a few seasons to see how fan sentiment can help predict these management decisions.

What it does

Visualizes fan sentiment for the current top 6 Premier League teams based on Reddit. Each data point has time as its x-coordinate and the team's polarity score as its y-coordinate. The polarity score is computed by weighting and averaging the polarity ratings of the top 5 comments in that specific post-match thread.
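The weighting and averaging step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes each comment is weighted by its upvote count, which the write-up does not specify, and the names `ScoredComment` and `thread_polarity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredComment:
    polarity: float  # sentiment polarity in [-1.0, 1.0]
    upvotes: int     # Reddit score of the comment (assumed to be the weight)


def thread_polarity(comments, top_n=5):
    """Upvote-weighted mean polarity of the top_n highest-scored comments."""
    top = sorted(comments, key=lambda c: c.upvotes, reverse=True)[:top_n]
    # Clamp weights at zero so downvoted comments cannot flip the sign.
    weights = [max(c.upvotes, 0) for c in top]
    total = sum(weights)
    if total == 0:
        # No usable weights: fall back to a plain mean (0.0 if there are no comments).
        return sum(c.polarity for c in top) / len(top) if top else 0.0
    return sum(c.polarity * w for c, w in zip(top, weights)) / total


# One thread becomes one y-value on the chart.
print(thread_polarity([ScoredComment(0.5, 300), ScoredComment(-0.5, 100)]))  # 0.25
```

Weighting by upvotes lets a heavily endorsed comment count for more than a marginal one; a plain mean over the top 5 would be an equally reasonable reading of the description.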

How we built it

We used PRAW to access the Reddit API from Python and TextBlob, a natural language processing library, to run sentiment analysis on every post-match thread and show fan sentiment over time.
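A rough sketch of how PRAW and TextBlob might fit together is shown below. It is an assumption-laden illustration rather than the project's code: the function names, the search query `"Post Match Thread"`, the `user_agent` string, and the placeholder credentials are all hypothetical. The third-party imports are done inside the functions so the helpers can be read and exercised separately; both packages need to be installed (`pip install praw textblob`).

```python
def comment_polarity(text):
    """Return TextBlob's polarity for text, a float in [-1.0, 1.0]."""
    from textblob import TextBlob  # third-party; imported lazily
    return TextBlob(text).sentiment.polarity


def top_comments(submission, limit=5):
    """Return (body, score) pairs for the first `limit` top-level comments."""
    submission.comment_sort = "top"            # set before the comments are loaded
    submission.comments.replace_more(limit=0)  # drop "load more comments" stubs
    return [(c.body, c.score) for c in list(submission.comments)[:limit]]


def post_match_threads(subreddit_name, limit=50):
    """Yield submissions matching a hypothetical post-match search query."""
    import praw  # third-party; imported lazily

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder credentials
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="fan-sentiment-sketch",   # hypothetical user agent
    )
    # The query string is an assumption; thread titles vary by subreddit.
    return reddit.subreddit(subreddit_name).search(
        "Post Match Thread", sort="new", limit=limit
    )
```

For each submission returned by `post_match_threads`, `submission.created_utc` gives the timestamp for the x-axis, and applying `comment_polarity` to the bodies from `top_comments` gives the per-comment ratings.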

Challenges we ran into

Sentiment analysis is a non-trivial problem, and its results are imperfect. Combining the JavaScript data visualization with the Django back end was also difficult.


We believe this tool can surface interesting and useful predictions about how public reception as a whole influences major decisions. Its applications are not limited to soccer teams: it could extend to sports teams in general and, beyond that, to other communities with strong subjective opinions (such as finance or health policy).
