While participating in the BLM protests in Seattle, I was awestruck by the activists' fervor and appalled by the police response to peaceful protest. I saw tear gas launched into massive crowds and batons swung indiscriminately. I wondered whether Washington was an anomaly or whether the police response was equally violent in every state.
What it does
This project chronicles BLM protests across the United States and the police response to each one. While it was difficult to find police-response data for all 33,117 protests, I was able to find other useful data, such as when each protest was held, how many people attended, and what route it followed.
How we built it
The project had three steps: data collection, wrangling, and visualization. I collected data using web crawlers and discarded data points that did not specify where the protest was held. Data wrangling consisted of classifying the raw input into individual fields: City, State, Latitude, Longitude, Date, and Information about the protest itself. I used the Google Maps API to find the latitude and longitude. Lastly, I used Plotly Express to plot all the information on a map of the US.
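The wrangling step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Protest` class, the `tidy` and `add_coordinates` helpers, the raw record keys, and the `fake_geocode` lookup (with approximate coordinates) are all assumptions made for the example. The real project looked up coordinates through the Google Maps API, which is replaced here by a plain function so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Protest:
    """One tidy row, using the fields named in the writeup."""
    city: str
    state: str
    date: str
    info: str
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def tidy(raw_records):
    """Map raw dicts to Protest rows, skipping records with no location."""
    rows = []
    for rec in raw_records:
        city = (rec.get("city") or "").strip()
        state = (rec.get("state") or "").strip()
        if not city or not state:
            continue  # the protest location is unknown, so the record is dropped
        rows.append(Protest(
            city=city,
            state=state,
            date=(rec.get("date") or "").strip(),
            info=(rec.get("info") or "").strip(),
        ))
    return rows


def add_coordinates(rows, geocode):
    """Fill in latitude/longitude using any callable
    geocode(city, state) -> (lat, lon) or None."""
    for row in rows:
        coords = geocode(row.city, row.state)
        if coords is not None:
            row.latitude, row.longitude = coords
    return rows


# Hypothetical stand-in for a real geocoding service (approximate values).
KNOWN_COORDS = {("Seattle", "WA"): (47.61, -122.33)}


def fake_geocode(city, state):
    return KNOWN_COORDS.get((city, state))


# Hypothetical raw input, shaped like what a crawler might emit.
sample = [
    {"city": " Seattle ", "state": "WA", "date": "2020-06-01", "info": "March downtown"},
    {"city": "", "state": "", "date": "2020-06-02", "info": "Location not given"},
]

rows = add_coordinates(tidy(sample), fake_geocode)
```

Rows of this shape, once loaded into a table, are the kind of input a Plotly Express map call such as `scatter_geo` accepts through its `lat` and `lon` arguments.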
Challenges we ran into
Transforming raw data into a tidy table was probably the most challenging and time-consuming step, because each website had its own idiosyncratic format for arranging information, and key values like state and city were sometimes embedded in free text. As a result, I had to write parsing routines using Beautiful Soup to parse the HTML and extract these values. Second, working with a dataset this large was difficult because VSCode would crash if I did something silly like printing the whole dataset or running any operation that took up too much memory.
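To give a sense of what pulling a city and state out of free text can look like, here is a small sketch. It is not the project's parser and it does not use Beautiful Soup: it assumes the text has already been extracted from the HTML and relies on a regular expression instead. The `extract_city_state` helper, the `CITY_STATE` pattern, and the shortened `STATE_CODES` set are all hypothetical.

```python
import re

# A few two-letter postal codes; a real table would list every state.
STATE_CODES = {"WA", "OR", "CA", "NY", "MN"}

# Matches "City, ST", where City is one or more capitalized words.
CITY_STATE = re.compile(r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*), ([A-Z]{2})\b")


def extract_city_state(text):
    """Return the first (city, state) pair found in text, or None."""
    for match in CITY_STATE.finditer(text):
        city, state = match.groups()
        if state in STATE_CODES:  # ignore two-letter tokens that are not states
            return city, state
    return None
```

A pattern like this only handles the "City, ST" form. Spelled-out state names or cities preceded by another capitalized word would need extra rules, which is part of why this step took so much time.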
Accomplishments that we're proud of
I am proud that I asked a relevant question about our society and was able to use my data science skills to satisfy my curiosity.
What we learned
This project helped me exercise all my data science skills, from data collection all the way to visualization. I became acquainted with new tools like the Google Maps API and Plotly Express, which I had not had a chance to work with before, and I further refined my Python skills.
What's next for Protest Locations
While I didn’t envision the project growing beyond its current scope, I suspect it will be a valuable resource for journalists and reporters who write about BLM protests, and perhaps even for policymakers who make decisions on this topic. In addition, I intend to make this data publicly available on Kaggle so that people can use it in other creative ways and answer their own questions about BLM protests.