Many factors decide the winner of the upcoming midterm elections, and a candidate's social media presence is a valuable tool for building support and enthusiasm. We were interested in how other users interact with candidates on social media, since those interactions may indicate grassroots support or general popularity. Unfortunately, we could not find a service that analyzed tweets involving a given candidate; the ones we found only reported followers, likes, and retweets. So we created a tool that tracks politicians’ social media popularity over time, based on engagements from other Twitter users.
What it does
It analyzes past tweets mentioning each senatorial candidate. Our algorithm weights the engagements so that more recent ones count exponentially more than older ones, giving a measure of grassroots support. We then use this measure to predict who will win each election.
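A minimal sketch of this kind of weighting, under stated assumptions: the function names, the decay factor of 0.5 per week, and the engagement counts are all hypothetical and chosen only for illustration, not taken from the project.

```python
def weighted_engagement(weekly_counts, decay=0.5):
    """Exponentially weighted average of weekly engagement counts.

    weekly_counts is ordered oldest to newest. The newest week gets
    weight 1, the week before it gets `decay`, then `decay ** 2`, and so on.
    The decay value is an assumption, not the project's actual parameter.
    """
    if not weekly_counts:
        return 0.0
    n = len(weekly_counts)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(w * c for w, c in zip(weights, weekly_counts))
    return total / sum(weights)


def predict_winner(race):
    """Return the candidate with the highest weighted engagement score."""
    scores = {name: weighted_engagement(counts) for name, counts in race.items()}
    return max(scores, key=scores.get)


# Made-up numbers for illustration only.
example_race = {
    "Candidate A": [900, 800, 700, 600],   # engagement fading over four weeks
    "Candidate B": [300, 500, 800, 1200],  # engagement surging over four weeks
}
print(predict_winner(example_race))
```

With these made-up numbers, a plain average would favor Candidate A (750 against 700), while the exponential weighting favors Candidate B because the most recent weeks dominate the score.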
How we built it
To download the tweet data, we scraped Twitter’s advanced search, using Selenium to scroll the page automatically and BeautifulSoup in Python to parse the resulting HTML. This was computationally intensive, so we multithreaded the scraper and ran it on a Google Cloud virtual machine. We then displayed the data as graphs and maps in D3.js. Finally, we produced our own form of “prediction” from the tweets: a weighted average in which newer weeks receive exponentially greater weight.
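One way to spread scraping jobs across threads is sketched below using the standard library's thread pool. The `fetch_tweet_count` function is a hypothetical stand-in that returns a dummy value; in practice it would wrap the Selenium and BeautifulSoup steps. The query shape of (candidate, week) and the worker count are likewise assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_tweet_count(query):
    """Hypothetical stand-in for one scraping job.

    A real version would drive a browser and parse the page; this one
    returns a dummy value so the sketch runs without a browser or network.
    """
    candidate, week = query
    return len(candidate) + week  # placeholder result, not real data


def collect_counts(queries, max_workers=8):
    """Run the jobs on a thread pool and map each query to its result."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `queries` in its results.
        results = list(pool.map(fetch_tweet_count, queries))
    return dict(zip(queries, results))


queries = [
    (name, week)
    for name in ("Candidate A", "Candidate B")
    for week in range(1, 4)
]
counts = collect_counts(queries)
```

Threads suit this workload because each job spends most of its time waiting on page loads rather than computing.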
Challenges we ran into
Because covering every data point for every candidate required so many queries, collecting the data took hours, even with multithreading. Mapping the data with D3.js was also a challenge.
Accomplishments that we're proud of
Our model worked as intended, evaluating each senatorial election. Although they incorporate only one metric, the results compare well with those of FiveThirtyEight and various other polling sites, which supports our initial hypothesis that social media engagement is a determinant of political success.
What we learned
We learned how to create our own API using Python and Selenium. We also learned how to use Google Cloud's platform to run the many searches our data sets required.
What's next for Twitter Election Predictor
In the future, we would like to incorporate polling into our model. For example, if a candidate polls at 40% with 60% of the Twitter engagement, and the next week polls at 45% with 70% of the Twitter engagement, we could extrapolate from that trend to predict future results.
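One possible reading of this idea is a straight-line extrapolation through the two observations. The sketch below uses the numbers from the example; the function name, the linear model, and the future engagement share of 80% are assumptions for illustration, not a planned design.

```python
def extrapolate_poll(prev, curr, future_engagement):
    """Linearly extrapolate a poll number from two (engagement, poll) points."""
    (e0, p0), (e1, p1) = prev, curr
    if e1 == e0:
        return float(p1)  # engagement did not change, so there is no slope
    slope = (p1 - p0) / (e1 - e0)  # poll points gained per engagement point
    return p1 + slope * (future_engagement - e1)


# Week 1: 60% engagement, 40% in polls. Week 2: 70% engagement, 45% in polls.
projected = extrapolate_poll((60, 40), (70, 45), future_engagement=80)
print(projected)  # 50.0
```

Two data points only fix a line, so a real model would need more weeks of data before such a projection could be trusted.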