Inspiration

Given the broad theme of city life, our group chose safety as the most pressing issue. We focused on crime because many tourists and newcomers to a city do not know which areas are safe and which are not.

What it does

StatCrime provides crime insights for five major cities: Houston, Los Angeles, Chicago, New York, and Washington, DC. Each city page shows a predicted average daily safety score for the most dangerous area (determined by the volume of crime at the dataset's coordinates), along with a graph of the predicted hourly safety score. Both metrics range from 0 to 100, with 0 being the least safe and 100 the safest. Below those statistics, two heat maps show the crime hot spots in a city. The first is a simple overlaid heat map that highlights high-risk areas. The second is more detailed and includes a key showing which score corresponds to which color. That map is scored from 0 to 1, since it shows a density function.

How we built it

For the front end (the visual representation), we used React, Tailwind CSS, Shadcn UI, and Framer Motion to create an interactive, stylish UI with clean animations and an organized layout for the homepage, features, and city stats. For the machine learning metrics, we used Pandas for data analysis and processing, NumPy to work with large datasets, scikit-learn for our chosen ML algorithm (explained below), matplotlib to plot the graphs, and GeoPy to get a range of coordinates.

Our chosen machine learning model was K-Nearest Neighbors, which makes predictions from the data points closest to a given point. Using the 'KernelDensity' class from scikit-learn, we turned our dataset into a 'heat map' that gives the density of crime in each area. We also used that dataset to get the safety score for the most crime-dense location, using geopy's 'great_circle' distance, which accounts for the earth's spherical shape, to work out the range of coordinates around that area. Our data came from government websites and each city's public data page. Because of the time constraint, we used only a subset of the records in each dataset.
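To make the density step more concrete, here is a minimal sketch of how a crime heat map and hot-spot lookup could be put together. It is not our exact code: the coordinates are synthetic, the 1 km bandwidth and 2 km radius are arbitrary assumptions, and the hand-written haversine helper `great_circle_km` is a stand-in for geopy's 'great_circle' so the example only needs NumPy and scikit-learn.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

EARTH_RADIUS_KM = 6371.0  # mean earth radius, used to convert km to radians


def relative_density(coords_deg, bandwidth_km=1.0):
    """Fit a Gaussian KDE to (lat, lon) points; return densities scaled so the peak is 1."""
    coords_rad = np.radians(coords_deg)  # the haversine metric expects radians
    kde = KernelDensity(
        kernel="gaussian",
        metric="haversine",
        algorithm="ball_tree",
        bandwidth=bandwidth_km / EARTH_RADIUS_KM,  # assumed bandwidth, not tuned
    )
    kde.fit(coords_rad)
    log_density = kde.score_samples(coords_rad)  # log-density at each input point
    return np.exp(log_density - log_density.max())


def great_circle_km(center, points):
    """Haversine distance in km from one (lat, lon) pair to each row of points."""
    lat1, lon1 = np.radians(center)
    lat2, lon2 = np.radians(points).T
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


# Synthetic data (not real crime records): a tight cluster plus scattered points.
rng = np.random.default_rng(0)
cluster_center = np.array([41.88, -87.63])
cluster = cluster_center + rng.normal(scale=0.003, size=(50, 2))
scattered = cluster_center + rng.uniform(-0.3, 0.3, size=(10, 2))
coords = np.vstack([cluster, scattered])

density = relative_density(coords)          # values in (0, 1], usable as heat map weights
hot_spot = coords[np.argmax(density)]       # most crime-dense recorded location
nearby = great_circle_km(hot_spot, coords) <= 2.0  # assumed 2 km radius
print(f"hot spot: {hot_spot}, incidents within 2 km: {nearby.sum()}")
```

Dividing by the peak density keeps every value between 0 and 1, which matches the scale on the detailed heat map. Records flagged in `nearby` are the ones that would feed a score for the most dangerous area.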

Challenges we ran into

Each dataset formatted certain fields differently. For example, LA had separate columns for date and hour, while the Houston and DC datasets had a single column for date and time. Headers and data values also differed between cities, so we had to account for that when building our models, and we ended up making a separate model for each city. We also could not use the entire datasets, which contained around 50,000 records or more, because training would have taken too long. We capped the number of records to save time.
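One way to handle these differences is to map every city's columns onto a shared schema before modeling. The sketch below shows the idea with pandas. The column names (`date`, `hour`, `reported_at`, and so on) and the sample rows are made up for illustration and are not the real headers from any city's dataset, and `normalize` and `cap_records` are hypothetical helpers rather than the code we shipped.

```python
import pandas as pd


def normalize(df, lat_col, lon_col, datetime_col=None, date_col=None, hour_col=None):
    """Map one city's columns onto a shared schema: timestamp, hour, lat, lon."""
    if datetime_col is not None:
        # Single combined date-and-time column.
        timestamp = pd.to_datetime(df[datetime_col])
    else:
        # Separate date and hour columns: add the hour onto the date.
        hours = pd.to_timedelta(df[hour_col].astype(int), unit="h")
        timestamp = pd.to_datetime(df[date_col]) + hours
    return pd.DataFrame({
        "timestamp": timestamp,
        "hour": timestamp.dt.hour,
        "lat": df[lat_col].astype(float),
        "lon": df[lon_col].astype(float),
    })


def cap_records(df, max_rows, seed=0):
    """Randomly sample at most max_rows rows to keep training time manageable."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)


# Hypothetical layout with separate date and hour columns.
split_style = pd.DataFrame({
    "date": ["2023-01-05", "2023-01-06"],
    "hour": [14, 2],
    "LAT": [34.05, 34.10],
    "LON": [-118.25, -118.30],
})

# Hypothetical layout with one combined date-and-time column.
combined_style = pd.DataFrame({
    "reported_at": ["2023-01-05 14:30:00", "2023-01-06 23:05:00"],
    "latitude": [38.90, 38.91],
    "longitude": [-77.03, -77.04],
})

combined = pd.concat(
    [
        normalize(split_style, "LAT", "LON", date_col="date", hour_col="hour"),
        normalize(combined_style, "latitude", "longitude", datetime_col="reported_at"),
    ],
    ignore_index=True,
)
print(cap_records(combined, max_rows=3))
```

With a shared schema like this, a single modeling pipeline could in principle replace the per-city models, and the cap becomes one parameter instead of a per-dataset decision.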

Accomplishments that we're proud of

We are proud of the beautiful UI we created, as well as the ML models, which made reasonably accurate predictions. Our team worked well together, dividing the work by skill set to get the project done.

What we learned

Coming into this project, we had prior experience with frontend and backend technologies such as React, CSS, Node.js, and JavaScript. In the past 24 hours, however, most of us gained experience in machine learning processes such as data processing, model training and testing, and cross-checking the accuracy of our data. This project pushed us to our limits: we had to learn an entirely new technology and methodology in a short period and improvise with the materials and knowledge we had.

What's next for StatCrime

In the future, we want to scale this app to host crime statistics for any location. To do so, we would like to use a universal API (if one exists) that provides consistently structured crime data for every city, so that we can use CRUD operations against it and make our website more interactive, dynamic, and scalable nationally or globally. We would also like to use this API to let users enter a zip code and convert it to latitude and longitude, or whatever coordinate system the API uses.
