Social media has quickly become the primary source of news for many people around the world, and the prevalence of fake news has grown with it. With a range of recent natural disasters and the spread of coronavirus (COVID-19) raising the stakes of acting on available information, more and more individuals are falling victim to lies spread by fake news. While exploring this problem, we found a market gap for a tool that aggregates, filters, and visualizes global social media activity over a period of time. Our project, Re:Action, aims to empower users by giving them access to an aggregated and formatted set of tweets from around the world. This lets them spot the inconsistencies characteristic of fake news stories by comparing them with what the larger majority reports, so they can be fully informed before they act on the rapidly changing situation of our world.
What it does
Users can query keywords and visualize relevant tweets according to their geographical locations around the world. The keyword is used as a search parameter for both the news article and tweet scrapers, which capture a large collection of relevant elements of information. Using type metadata, geo-location and element information are extracted and then displayed at the corresponding location on the web app's world map.
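To make the extraction step more concrete, here is a minimal sketch of how scraped elements could be turned into map markers. The element layout (a dict with "type", "text", and a "geo" entry holding "coordinates") and the function names are assumptions made for illustration, not the schema Re:Action actually uses.

```python
from typing import List, Optional, Tuple


def extract_point(element: dict) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) for an element, or None if it has no usable location.

    Assumes a hypothetical layout: element["geo"]["coordinates"] == [lat, lon].
    """
    geo = element.get("geo") or {}
    coords = geo.get("coordinates")
    if not isinstance(coords, (list, tuple)) or len(coords) != 2:
        return None
    try:
        lat, lon = float(coords[0]), float(coords[1])
    except (TypeError, ValueError):
        return None
    # Reject values that cannot be real coordinates (this also rejects NaN).
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return lat, lon
    return None


def to_markers(elements: List[dict]) -> List[dict]:
    """Keep only elements that can be placed on the map, tagged by type."""
    markers = []
    for element in elements:
        point = extract_point(element)
        if point is None:
            continue  # no usable location, so nothing to plot
        markers.append({
            "type": element.get("type", "unknown"),
            "text": element.get("text", ""),
            "lat": point[0],
            "lon": point[1],
        })
    return markers
```

Elements without a valid location are simply skipped here; a fuller version would try another way of locating them before giving up.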
How I built it
We began by planning and wireframing our project using Figma.
For the back end, we used Python with Tweepy/GetOldTweets3 to implement our web scrapers and data processing. After serving our multi-threaded scripts with Flask, all relevant information was stored in a MongoDB Atlas database, from which it was retrieved and displayed in the web app.
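As a rough illustration of the multi-threaded scraping step, the sketch below runs a tweet scraper and a news article scraper concurrently for one keyword and merges their results. The two scraper functions are hypothetical placeholders that return canned data; the real scrapers, the Flask routes, and the MongoDB writes are not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def scrape_tweets(keyword: str) -> List[dict]:
    # Hypothetical placeholder: a real version would call the tweet scraper.
    return [{"type": "tweet", "text": f"tweet about {keyword}"}]


def scrape_articles(keyword: str) -> List[dict]:
    # Hypothetical placeholder: a real version would call the article scraper.
    return [{"type": "article", "text": f"article about {keyword}"}]


def collect(
    keyword: str,
    scrapers: Sequence[Callable[[str], List[dict]]] = (scrape_tweets, scrape_articles),
) -> List[dict]:
    """Run every scraper for the keyword in its own thread and merge the results."""
    results: List[dict] = []
    with ThreadPoolExecutor(max_workers=len(scrapers)) as pool:
        # map() yields batches in the same order as `scrapers`,
        # even if the threads finish in a different order.
        for batch in pool.map(lambda scrape: scrape(keyword), scrapers):
            results.extend(batch)
    return results
```

Threads suit this kind of work because scrapers spend most of their time waiting on network responses rather than using the CPU.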
Challenges I ran into
The official Twitter Search API that we used severely limited the number of calls we could make (maximum 300 queries for 18 requests every 15 minutes), which made it difficult to obtain a data set large enough to train our machine learning model. Many of the tweets also included incomplete or improperly formatted location data, which made plotting the collected elements difficult. This forced us to rely on other methods to identify a suitable map location for them.
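One common way to stay within a per-window quota like this is to throttle requests on the client side. The sketch below is a generic sliding-window limiter, not the approach we necessarily took; the class name is hypothetical, and the quota and window length are parameters to be set from the API's documented limits.

```python
import time
from collections import deque
from typing import Callable, Deque


class WindowLimiter:
    """Allow at most `max_calls` calls in any `window_seconds` span."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()  # times of recent calls

    def _drop_expired(self, now: float) -> None:
        # Forget calls that have fallen out of the current window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def wait(self) -> None:
        """Block until one more call is allowed, then record it."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self.window - (now - self._stamps[0]))
            now = self._clock()
            self._drop_expired(now)
        self._stamps.append(now)
```

Calling `limiter.wait()` before each API request spreads the scraping over time instead of failing once the quota is exhausted. The clock and sleep functions are injectable so the limiter can be exercised without real delays.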
Accomplishments that I'm proud of
Some of the libraries featured in the front end of our web app were made from scratch.