CodeJam DataDive 2017

Category: Transportation
Datasets used: NYC General Transport
Team: Erin O'Neill, Anudruth Manjunath, Brendan Kellam, Eliott Bourachot


Seeing the huge number of accidents in New York City, we wanted to give people the power to choose the safest route when travelling.

What it does

Our app receives suggested routes from the Google Directions API and, taking our 2014 vehicle collision data into account, calculates and returns the safest route to the user: the one that avoids as many streets with a high density of accidents as possible.

How we built it

Using Python and a variety of Google APIs, we first took the data provided by the CodeJam DataDive team and created a heatmap to accurately represent the most dangerous parts of New York City. Then, through the interpolation of waypoints, we managed to obtain enough data to count the number of accidents along a specific route. We then assigned a safety rating to each of the suggested routes obtained through the Google Directions API. The route with the best safety rating is then displayed as the safest route.
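The final selection step can be sketched roughly as follows. This is an illustration under stated assumptions rather than our actual code: the names `route_score` and `pick_safest_route`, the `accident_count` callback, and the representation of a route as a list of (lat, lng) tuples are all hypothetical, and we assume here that a lower accident total means a better safety rating.

```python
from typing import Callable, List, Sequence, Tuple

LatLng = Tuple[float, float]


def route_score(nodes: Sequence[LatLng],
                accident_count: Callable[[LatLng], int]) -> int:
    """Sum the accident counts looked up at every node along one route."""
    return sum(accident_count(node) for node in nodes)


def pick_safest_route(routes: List[List[LatLng]],
                      accident_count: Callable[[LatLng], int]) -> int:
    """Return the index of the route with the lowest total accident count.

    Assumption: a lower total means a safer route.
    """
    if not routes:
        raise ValueError("no candidate routes supplied")
    scores = [route_score(route, accident_count) for route in routes]
    return min(range(len(routes)), key=scores.__getitem__)
```

In this sketch `accident_count` stands in for whatever lookup returns the number of collisions near a point, so the selection logic stays independent of how the collision data is stored.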

Challenges we ran into

Analyzing our data and deciding on a project proved difficult, as we wanted to find a way to use the data sets to improve a consumer’s everyday life. We also spent a lot of time designing the data structure that stores the vehicle collision data: we needed to access this information quickly and efficiently, and we needed to avoid loading the data in real time.
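One way to avoid loading the collision data at request time is to precompute the structure once and persist it. The sketch below assumes the data has already been reduced to a 2D list of counts; the JSON file format and the function names `save_grid` and `load_grid` are illustrative choices, not a description of what we actually shipped.

```python
import json
from pathlib import Path
from typing import List

Grid = List[List[int]]


def save_grid(grid: Grid, path: Path) -> None:
    """Write the precomputed grid to disk so it is built only once."""
    path.write_text(json.dumps(grid))


def load_grid(path: Path) -> Grid:
    """Read the precomputed grid back, e.g. once at application start-up."""
    return json.loads(path.read_text())
```

With this arrangement the expensive pass over the raw collision records happens offline, and the running app only reads a small file of counts.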

Accomplishments that we're proud of

Since Google’s API only returns the waypoints of a route, we needed to construct more nodes to define a path. Our interpolation algorithm recursively determines the midpoint between two input latitude-longitude pairs (initially two waypoints) and snaps each midpoint to a road using Google’s Snap to Roads API. We also compressed the vehicle collision data into a 2D array which maps all of the collisions in NYC onto a grid. Our program takes an input latitude-longitude pair (a node along our route), maps it to its respective grid index, and returns the safety score of the grid cell in which the node lies. This allows us to define a safety score for each route (i.e., the sum of the scores of its nodes).
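A minimal sketch of both ideas, under several assumptions: the bounding box and the 100 x 100 grid size are placeholder values, the midpoint is a plain average of latitude and longitude (reasonable over short city distances), the safety score of a cell is taken to be its raw collision count, and the Snap to Roads call is left out and only marked with a comment. All function names are hypothetical.

```python
from typing import Iterable, List, Tuple

LatLng = Tuple[float, float]

# Hypothetical bounding box roughly covering NYC, and a hypothetical grid size.
LAT_MIN, LAT_MAX = 40.49, 40.92
LNG_MIN, LNG_MAX = -74.27, -73.68
ROWS, COLS = 100, 100


def interpolate(a: LatLng, b: LatLng, depth: int) -> List[LatLng]:
    """Recursively insert midpoints between a and b (both endpoints included)."""
    if depth == 0:
        return [a, b]
    mid = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    # In the real app, the midpoint would be snapped to a road at this point.
    left = interpolate(a, mid, depth - 1)
    right = interpolate(mid, b, depth - 1)
    return left + right[1:]  # drop the duplicated midpoint


def grid_index(point: LatLng) -> Tuple[int, int]:
    """Map a latitude-longitude pair to a (row, col) cell, clamped to the grid."""
    lat, lng = point
    row = int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * ROWS)
    col = int((lng - LNG_MIN) / (LNG_MAX - LNG_MIN) * COLS)
    return min(max(row, 0), ROWS - 1), min(max(col, 0), COLS - 1)


def build_grid(collisions: Iterable[LatLng]) -> List[List[int]]:
    """Count collisions per cell; here the count serves as the cell's score."""
    grid = [[0] * COLS for _ in range(ROWS)]
    for point in collisions:
        row, col = grid_index(point)
        grid[row][col] += 1
    return grid


def score_nodes(nodes: Iterable[LatLng], grid: List[List[int]]) -> int:
    """Total score of a route: the sum of the scores of the cells its nodes hit."""
    return sum(grid[row][col] for row, col in map(grid_index, nodes))
```

A recursion depth of `d` yields `2**d + 1` nodes between two waypoints, and each lookup is a constant-time array access, which is what makes scoring several candidate routes cheap.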

What we learned

Through this experience we learned the immense power of data and the benefits that can be obtained from using it properly. To interface our Python code with the Google APIs, we had to learn how they work in depth and how to manipulate the route data to the advantage of our algorithms.

What's next for SafeTrip

We plan to improve our interpolation algorithm to control the number of nodes constructed between two waypoints based on their distance from one another. We will implement a function such that, as the distance between two waypoints increases, our program interpolates more nodes, all while ensuring the scale for smaller distances is still reasonable. We will also offer users the ability to weight their priorities for safety and time.
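Neither feature exists yet, so the following is only one possible shape for them. It assumes the haversine formula for distance, a hypothetical target spacing of 50 m between nodes, a cap on recursion depth, and a single `safety_weight` between 0 and 1 that blends normalised safety and travel time; every name and default value here is an assumption.

```python
import math
from typing import Sequence, Tuple

LatLng = Tuple[float, float]
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a: LatLng, b: LatLng) -> float:
    """Great-circle distance in metres between two latitude-longitude pairs."""
    lat1, lng1 = map(math.radians, a)
    lat2, lng2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def interpolation_depth(a: LatLng, b: LatLng,
                        target_spacing_m: float = 50.0,
                        max_depth: int = 8) -> int:
    """Smallest halving depth that brings node spacing under the target, capped."""
    distance = haversine_m(a, b)
    if distance <= target_spacing_m:
        return 0  # close waypoints need no extra nodes
    depth = math.ceil(math.log2(distance / target_spacing_m))
    return min(depth, max_depth)


def best_route(safety_scores: Sequence[float],
               durations_s: Sequence[float],
               safety_weight: float) -> int:
    """Index of the route with the lowest blended cost (lower score = safer)."""
    if not 0.0 <= safety_weight <= 1.0:
        raise ValueError("safety_weight must be between 0 and 1")
    if not safety_scores or len(safety_scores) != len(durations_s):
        raise ValueError("need one safety score and one duration per route")
    max_score = max(safety_scores) or 1  # avoid dividing by zero
    max_time = max(durations_s) or 1
    costs = [safety_weight * s / max_score + (1 - safety_weight) * t / max_time
             for s, t in zip(safety_scores, durations_s)]
    return min(range(len(costs)), key=costs.__getitem__)
```

Capping the depth keeps long segments from generating an unbounded number of nodes, and returning a depth of 0 for short segments keeps the scale reasonable for nearby waypoints.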
