Inspiration

We were inspired by the East Banc datasets and wanted to create an algorithm to reduce the waiting time for a taxi.

What it does

Our project analyzes 2019 Taxi data from East Banc Technologies, processing the data into dictionaries that we can analyze. Our algorithm clusters the geographic coordinates using the k-means algorithm and splits the times into 15 minute intervals. For each interval and cluster, we record the number of taxis in that area in a dictionary. We calculate the means of each cluster and put that into another dictionary. Both of these dictionaries are outputted into a text file that is processed by another script to predict the taxi demand for future dates using a robust nonlinear regression algorithm. The analyzed and predicted data is processed and plotted on a map on our web interface.

How we built it

We built our algorithm in three Google Colab Jupyter notebooks and used HTML and CSS for our website.

Challenges we ran into

We ran into challenges with processing the data since there were 12 CSV files for each month of 2019 and each file had over 500,000 rows of data. Additionally, we were unfamiliar with the many python libraries that proved helpful for our project.

Accomplishments that we're proud of

We are proud of creating this project and finishing our algorithms and website.

What we learned

We learned how to make a website using HTML, CSS, and Bootstrap. We were able to grasp how to use the pandas, numpy, and scipy libraries through this project. Lastly, we picked up how to use the many facets of Google Colab.

What's next for Taxi Traffic

In the future, we will look for more data so we can improve the accuracy of our prediction model as well as extending into other popular areas for taxis.

Built With

Share this project:

Updates