Election Day 2016: shock spread across the country as the electoral votes were tallied. How had polls, projections, newspapers, and experts all mispredicted the results? Perhaps they didn't have a true sense of the political pulse of the country. Tweets serve as a reflection of the thoughts, feelings, and ideas of people around the globe, and analyzing them may reveal otherwise overlooked information. On Election Day, Twitter is especially active, with users voicing opinions about their parties and favorite candidates. With so much data, our team was intrigued by the possibility of predicting the leading party in a state based on the thoughts and feelings published in America's Twitter feeds.
What it does
Our project analyzes over 400,000 tweets from Election Day (November 8th) 2016. Our extraction process collects the tweets posted that day along with 32 features of each tweet, such as the user who posted it and where it was posted from. Using natural language processing (NLP) and a pre-trained neural network, we assign each tweet a political ideology score ranging from 0 to 1, where 1 indicates Democratic and 0 indicates Republican. We aggregate the tweets by state and hour to show how political sentiment on Twitter progressed throughout Election Day. Using the aggregated score for each state, we then predict the overall result of the election by predicting the allocation of Electoral College votes. Our front-end design displays this data on an interactive map of the United States that updates over time.
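The aggregation and allocation steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: the sample tweets, the function names, the 0.5 decision threshold, and the winner-take-all rule for every state are all assumptions made for the example (Maine and Nebraska, for instance, can split their votes).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (state, ISO timestamp, ideology score in [0, 1]).
# A score of 1 indicates Democratic and 0 indicates Republican.
tweets = [
    ("VT", "2016-11-08T09:15:00", 0.875),
    ("VT", "2016-11-08T09:40:00", 0.625),
    ("VT", "2016-11-08T10:05:00", 0.75),
    ("WY", "2016-11-08T09:30:00", 0.25),
    ("WY", "2016-11-08T10:45:00", 0.375),
]

# Electoral votes for the two states in the sample (2016 allocation).
ELECTORAL_VOTES = {"VT": 3, "WY": 3}


def mean_score_by_state_hour(rows):
    """Average the ideology score for each (state, hour) bucket."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for state, timestamp, score in rows:
        hour = datetime.fromisoformat(timestamp).hour
        bucket = sums[(state, hour)]
        bucket[0] += score
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def mean_score_by_state(rows):
    """Average the ideology score over all tweets from each state."""
    sums = defaultdict(lambda: [0.0, 0])
    for state, _timestamp, score in rows:
        sums[state][0] += score
        sums[state][1] += 1
    return {state: total / count for state, (total, count) in sums.items()}


def allocate_electoral_votes(state_scores, electoral_votes, threshold=0.5):
    """Give each state's votes to one party (assumed winner-take-all).

    The threshold is an assumption: scores above it count as Democratic.
    """
    totals = {"Democratic": 0, "Republican": 0}
    for state, score in state_scores.items():
        party = "Democratic" if score > threshold else "Republican"
        totals[party] += electoral_votes[state]
    return totals


hourly = mean_score_by_state_hour(tweets)
by_state = mean_score_by_state(tweets)
prediction = allocate_electoral_votes(by_state, ELECTORAL_VOTES)
```

The hourly averages are the kind of values a time-stepped map could display, while the per-state averages feed the final electoral vote tally.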
How we built it
Challenges we ran into
One challenge that arose during development was cleaning and pre-processing the data, since the formats were not standard. Specifically, we wanted to ensure all the data captured was appropriately stratified and relevant. Another challenge was the sheer volume of data we were handling and training our model on. To meet the time constraints, we parallelized the process. Finally, due to the subjective nature of political ideologies, it was challenging to find an appropriate dataset with accurate labels to train the model on.
Accomplishments that we're proud of
We are extremely proud of jointly creating an application with real-world uses and successfully solving many complex, challenging problems. It is exciting that in such a short time span we were able to create a neural network with 75% classification accuracy. Most importantly, however, we are proud of working together to create a beautiful visual representation of extremely nuanced and relevant data.
What we learned
Our team truly worked together, and every member, from our data processing specialist to our front-end developer, learned and contributed to various aspects of the project. We discussed decisions, challenges, and goals as a team and learned everything from how to train a neural network to how many Electoral College votes Vermont has.
What's next for Tweet to Vote
We hope to start pulling real-time Twitter data for the U.S. and constantly updating the map to represent the political climate of the nation. Although there are several challenges in doing so, we are excited to begin tackling the task.
Note: Please open in Firefox at 80% zoom