Predicting Capital Bikeshare bike availability

Competition: Hippo Hacks 2018

This is my submission to the Hippo Hacks 2018 hackathon, held at The George Washington University and sponsored by Google.

Predicting the number of bikes available at a dock is a more complex problem than it appears. Capital Bikeshare regularly restocks dock stations that are running low on bikes and are heavily used by cyclists. This project aims to take that restocking into account and predict the number of bikes available at a station at any given time.

The main challenges in building this project were getting access to suitable data and creating a real-time solution. The historical data published by Capital Bikeshare consists of records for individual trips, whereas the goal of this project is to predict how many bikes are available at a station at a selected time of day, which requires station-level availability data.

The backbone of the app is two cron jobs that run around the clock.
The first downloads real-time data for all the bike dock stations in DC (500 stations) every 5 minutes from the Capital Bikeshare data feed. This data is stored in Google Cloud SQL.
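As a rough illustration of the first job, the sketch below stores one poll of station data in a SQL table. It assumes the feed has already been downloaded and parsed into a GBFS-style `station_status` payload, uses an in-memory SQLite database as a stand-in for Cloud SQL, and invents the table name, columns, and the `store_snapshot` helper; none of these come from the project itself.

```python
import sqlite3

# Hypothetical sample payload in the GBFS station_status layout (values are made up).
SAMPLE_FEED = {
    "last_updated": 1520000000,
    "data": {
        "stations": [
            {"station_id": "31000", "num_bikes_available": 5,
             "num_docks_available": 10, "last_reported": 1519999900},
            {"station_id": "31001", "num_bikes_available": 0,
             "num_docks_available": 15, "last_reported": 1519999950},
        ]
    },
}


def store_snapshot(conn, feed):
    """Insert one row per station from a parsed station_status payload."""
    # Assumed schema: one row per (station, report time).
    conn.execute(
        """CREATE TABLE IF NOT EXISTS station_status (
               station_id      TEXT    NOT NULL,
               reported_at     INTEGER NOT NULL,
               bikes_available INTEGER NOT NULL,
               docks_available INTEGER NOT NULL,
               PRIMARY KEY (station_id, reported_at)
           )"""
    )
    rows = [
        (s["station_id"], s["last_reported"],
         s["num_bikes_available"], s["num_docks_available"])
        for s in feed["data"]["stations"]
    ]
    # OR IGNORE skips rows already stored by an earlier poll (SQLite syntax;
    # MySQL spells this INSERT IGNORE).
    conn.executemany(
        "INSERT OR IGNORE INTO station_status VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_snapshot(conn, SAMPLE_FEED)
```

A cron entry that runs a script like this every 5 minutes would build up the station-level history that the trip records lack.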

The second retrains two machine learning models (a random forest regressor and a neural network) every hour on all of the data stored in Google Cloud SQL. The script saves the weights it calculates for both models each hour.
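Because the web app reads weights that the training job rewrites every hour, the handoff between the two matters. The write-up does not say how the weights are stored, so the following is only one possible sketch: it pickles the models to a temporary file and swaps it into place, so a reader never sees a half-written file. The `save_models` and `load_models` helpers are hypothetical names.

```python
import os
import pickle
import tempfile


def save_models(models, path):
    """Write `models` (any picklable object) to `path` without partial writes."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(models, fh)
        # os.replace swaps the new file in as a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_models(path):
    """Load whatever the most recent training run saved."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The hourly job would call `save_models` after training, and the web app would call `load_models` before predicting.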

A Flask application reads the user's input (date, time, and station location), predicts the number of available bikes at the specified station with each model using the latest saved weights, and returns the average of the two predictions.
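The averaging step can be sketched as follows. The sketch assumes each model exposes a scikit-learn-style `predict(rows)` method that returns one value per row; the feature layout, the `ConstantModel` stand-in, and the rounding and clamping at zero are illustrative assumptions rather than details taken from the app.

```python
class ConstantModel:
    """Stand-in for a trained model; always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, rows):
        # One prediction per input row, like scikit-learn regressors.
        return [self.value for _ in rows]


def predict_available_bikes(models, features):
    """Average the per-model predictions for a single feature row."""
    predictions = [float(m.predict([features])[0]) for m in models]
    mean = sum(predictions) / len(predictions)
    # Assumption: report a whole, non-negative number of bikes.
    return max(0, round(mean))


# Hypothetical feature row: [station id, hour of day, day of week].
features = [31000, 8, 2]
models = [ConstantModel(7.2), ConstantModel(5.4)]
bikes = predict_available_bikes(models, features)
```

Averaging two different model families is a simple form of ensembling: errors that the models do not share tend to partially cancel.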

The application is deployed on Google Cloud Platform and, for security purposes, is accessible only from The George Washington University IP address ranges.
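The write-up does not say whether this restriction is enforced by a cloud firewall rule or inside the app. If it were done at the application level, a check along these lines could work. The CIDR ranges below are reserved documentation ranges used as placeholders, not the university's real ranges, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

# Placeholder networks (RFC 5737 documentation ranges), NOT real campus ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if `client_ip` falls inside any allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are rejected rather than raising.
        return False
    return any(addr in net for net in networks)
```

A web framework hook that runs before each request could call `is_allowed` with the client address and return a 403 response when it fails.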

An advantage of this approach is that, as time passes, the prediction accuracy should keep improving, because both models have more data points to train on with every hourly run.

Link: http://35.227.32.91:5000/
GitHub: https://github.com/anshgandhi/Predicting_Capital_Bikeshare_bike_availability_realtime

(An updated link is available in the GitHub repo.)

anshgandhi16@gmail.com
