We wanted to analyze 10 years of Iowa City arrest records. We had many ambitious ideas for this data, such as using geolocation to identify high-crime areas, looking for correlations between arrests and University of Iowa home football games, and measuring how crime changes during the school year.

How we built it

We first tried to build on the arrest data we had by extracting features such as age, the day of the week on which the arrest occurred, and the time of arrest. Luke and Ryan performed most of the raw data cleaning, while I worked on bringing the data into Tableau. Through this analysis we noticed quite a few anomalies in the data set.
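As a rough illustration of the feature extraction described above, here is a minimal Python sketch. It is not our actual cleaning code: the function name `derive_features`, the timestamp format, and the sample values are all assumptions made up for the example.

```python
from datetime import datetime

# Monday is 0 in datetime.weekday(), so the list starts with Monday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def derive_features(arrest_ts: str, dob: str) -> dict:
    """Derive age, day of week, and hour from one arrest record.

    Assumes (hypothetically) that timestamps look like 'YYYY-MM-DD HH:MM'
    and dates of birth look like 'YYYY-MM-DD'.
    """
    arrested = datetime.strptime(arrest_ts, "%Y-%m-%d %H:%M")
    born = datetime.strptime(dob, "%Y-%m-%d").date()
    day = arrested.date()
    # Subtract one year if the birthday has not happened yet that year.
    age = day.year - born.year - ((day.month, day.day) < (born.month, born.day))
    return {
        "age": age,
        "day_of_week": DAY_NAMES[arrested.weekday()],
        "hour": arrested.hour,
    }

# Made-up sample record, not taken from the real data set.
features = derive_features("2014-09-13 23:40", "1993-10-02")
print(features)
```

Keeping the derived fields in a plain dictionary makes it easy to write them out as extra columns before loading the data into a tool such as Tableau.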

Challenges we ran into

Many entries didn't make sense, and we ultimately determined that there were many errors and inconsistencies in the data, such as misspellings and wrong dates of birth, and there didn't appear to be a standard for entering the type of crime. When attempting geolocation, we found that many of the addresses where arrests were reported could not be geolocated. We also had various issues
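Checks like the ones below are one way these problems could be caught early. This is only a sketch under assumptions: the alias table, the age bounds, and the function names `normalize_charge` and `dob_is_plausible` are hypothetical and would need to be tuned against the real records.

```python
import re
from datetime import date

# Hypothetical alias table; real entries would come from reviewing the data.
CHARGE_ALIASES = {
    "PUBLIC INTOX": "PUBLIC INTOXICATION",
    "PUB INTOX": "PUBLIC INTOXICATION",
}

def normalize_charge(raw: str) -> str:
    """Uppercase, replace punctuation with spaces, collapse whitespace, map aliases."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", raw.upper())
    cleaned = " ".join(cleaned.split())
    return CHARGE_ALIASES.get(cleaned, cleaned)

def dob_is_plausible(dob: date, arrest_date: date,
                     min_age: int = 10, max_age: int = 100) -> bool:
    """Flag dates of birth that fall after the arrest or imply an unlikely age.

    The age bounds are assumptions chosen for illustration.
    """
    if dob >= arrest_date:
        return False
    age = arrest_date.year - dob.year - (
        (arrest_date.month, arrest_date.day) < (dob.month, dob.day)
    )
    return min_age <= age <= max_age

# Made-up examples, not taken from the real data set.
print(normalize_charge(" public  intox. "))
print(dob_is_plausible(date(2015, 1, 1), date(2014, 9, 13)))
```

Records that fail such checks can be set aside for manual review rather than silently dropped, so that the size of the problem stays visible.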

Accomplishments that we're proud of

Ultimately we were able to gain some insight into the arrest records, and

What we learned

I had never really worked on anything like this, and I learned quite a bit about working with big data. I also learned to use a new tool (Tableau), which was challenging and rewarding.

What's next for bd-hack

We are going to make visualizations and analyze several questions in more depth: 1) How does game day affect Iowa City crime? 2) Do games differ from one another (for example, does the opponent or the time of year matter)? 3) Are there any previously missed patterns in the data?
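A first pass at the game-day question could be as simple as comparing average daily arrest counts. The sketch below assumes we have a list of arrest dates and a set of home game dates; the function name `mean_daily_arrests` and the tiny data set are made up for illustration and say nothing about the real records.

```python
from collections import Counter
from datetime import date
from statistics import mean

def mean_daily_arrests(arrest_dates, game_days, all_days):
    """Return (mean arrests on game days, mean arrests on other days).

    Days with no arrests count as zero. Raises statistics.StatisticsError
    if either group of days is empty.
    """
    per_day = Counter(arrest_dates)
    game = [per_day[d] for d in all_days if d in game_days]
    other = [per_day[d] for d in all_days if d not in game_days]
    return mean(game), mean(other)

# Toy data, invented for this example only.
all_days = [date(2014, 9, 12), date(2014, 9, 13), date(2014, 9, 14)]
game_days = {date(2014, 9, 13)}
arrest_dates = [
    date(2014, 9, 12),
    date(2014, 9, 13), date(2014, 9, 13), date(2014, 9, 13),
    date(2014, 9, 14),
]

game_mean, other_mean = mean_daily_arrests(arrest_dates, game_days, all_days)
print(game_mean, other_mean)
```

A fairer comparison would probably match game days against non-game Saturdays in the same season, since weekends likely differ from weekdays regardless of football.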

Built With

  • jupyter-notebook
  • r