Each year, 132 people die in vehicle-related incidents in the US Waste & Recycling industry, 98 of them civilians. Each fatality costs the company involved ~$1.5m in direct costs. There are also thousands of other incidents resulting in environmental disaster, property damage and serious injury. We picked this industry because its operations include high-risk activities (e.g., driving large trucks with many stops and starts) in uncontrolled environments (e.g., weather, traffic, route hazards, pedestrians, etc.) where members of the community are at risk.

The main risk drivers are people-related (e.g., experience of driver, their driving record, etc.); systems-related (e.g., time since last maintenance, age of truck); and environment-related (e.g., weather, high-risk route, school zones, etc.). Our product, BSafe, is a predictive analytics SaaS that pushes a notification to truck drivers through Twilio when the combined risk factors result in a heightened probability that a serious incident will occur. Drivers can then take corrective action such as slowing down or taking a safer route to complete the job.

BSafe reduces the probability of serious incidents. This saves companies millions of dollars each year in direct costs; improves employee morale and company culture; and improves sustainability through reduced environmental impacts.

What it does

BSafe is a machine learning model based on neural networks and XGBoost. It takes data from a diverse range of public sources (e.g., Google Maps, etc.) and private sources (e.g., driver data, truck data, etc.) covering all accidents along 6 routes over the last 5 years, totaling 2,443 accidents and 3 fatalities. The model is trained on this data to identify combinations of risk factors that are associated with a high probability of a serious incident occurring. Across all of Boston there have been 26,307 accidents and 129 fatalities in the last 5 years, or roughly one fatality for every 200 accidents.
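The ratios behind those figures can be checked with a few lines of Python. This is only an arithmetic sketch using the numbers quoted above; the helper name is ours, not part of BSafe.

```python
# Figures quoted in the text above (last 5 years).
boston_accidents = 26_307
boston_fatalities = 129
route_accidents = 2_443
route_fatalities = 3


def accidents_per_fatality(accidents: int, fatalities: int) -> float:
    """Return how many recorded accidents there are per fatality."""
    return accidents / fatalities


citywide_ratio = accidents_per_fatality(boston_accidents, boston_fatalities)
route_ratio = accidents_per_fatality(route_accidents, route_fatalities)

print(f"Boston: one fatality per {citywide_ratio:.0f} accidents")
print(f"Six routes: one fatality per {route_ratio:.0f} accidents")
```

With only 3 fatalities in the six-route sample, the most serious outcome is a very rare class in the training data.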

The company can then set its own risk threshold to determine when a driver should be notified that they have entered a high-risk situation.

How we built it

First, we used Python and Keras to build a machine learning model using a neural network to determine a safety score for a combination of routes, drivers and trucks. We then performed a cross-validated grid search across various settings for each model to determine the optimal hyperparameters. The model was trained on a Google Cloud Virtual Machine (8 vCPUs, 30 GB memory). We scraped the database for historical incident data across 6 routes. We also gathered and processed historical weather data to improve the predictive power of the model.
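To make the idea of a cross-validated grid search concrete, here is a minimal sketch in plain Python. It is not the BSafe code: a tiny hand-written logistic regression stands in for the Keras network so the example runs without dependencies, and the two features, the synthetic data and the hyperparameter grid are all assumptions made for illustration.

```python
import itertools
import math
import random


def train_logistic(rows, labels, learning_rate, epochs):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            z = bias + sum(w * x for w, x in zip(weights, features))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            error = 1.0 / (1.0 + math.exp(-z)) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    weights, bias = model
    correct = 0
    for features, label in zip(rows, labels):
        z = bias + sum(w * x for w, x in zip(weights, features))
        correct += int((z > 0) == bool(label))
    return correct / len(rows)


def k_fold_indices(n_rows, k):
    """Yield (train_idx, val_idx) pairs; each row is validated exactly once."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def grid_search_cv(rows, labels, grid, k=3):
    """Return (best_params, best_score) by mean validation accuracy."""
    best_params, best_score = None, -1.0
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for train_idx, val_idx in k_fold_indices(len(rows), k):
            model = train_logistic([rows[i] for i in train_idx],
                                   [labels[i] for i in train_idx], **params)
            scores.append(accuracy(model,
                                   [rows[i] for i in val_idx],
                                   [labels[i] for i in val_idx]))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Synthetic stand-in data: two hypothetical features scaled to [0, 1]
# (say, weather severity and driver risk); label 1 marks an "incident".
rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(60)]
labels = [1 if r[0] + r[1] > 1.0 else 0 for r in rows]

grid = {"learning_rate": [0.01, 0.1, 0.5], "epochs": [5, 50]}
best_params, best_score = grid_search_cv(rows, labels, grid)
print(best_params, round(best_score, 2))
```

Every combination in the grid is scored on held-out folds only, so the chosen settings are not simply the ones that fit the training rows best.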

Second, we set up a safety trigger threshold in the model which can be tailored for each company based on its individual risk tolerance. When the safety risk threshold is triggered, the Python code running the model sends a text message via Twilio to the driver’s phone.
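The trigger logic can be sketched as follows. This is an illustration under stated assumptions rather than the BSafe implementation: the function names, the message wording and the assumption that the risk score is a probability between 0 and 1 are ours. The Twilio call uses the `Client(...).messages.create(...)` interface of the official Python helper library and is imported lazily, so the rest of the sketch runs without Twilio installed.

```python
from typing import Callable, Optional


def build_alert(risk_score: float, threshold: float) -> Optional[str]:
    """Return an alert message when the score reaches the threshold, else None."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= threshold:
        return (f"BSafe alert: elevated incident risk ({risk_score:.0%}). "
                "Slow down or consider a safer route.")
    return None


def send_sms_via_twilio(to_number: str, body: str, *, account_sid: str,
                        auth_token: str, from_number: str) -> None:
    """Send one SMS through Twilio (credentials and numbers are supplied by the caller)."""
    from twilio.rest import Client  # lazy import: only needed for real sends

    client = Client(account_sid, auth_token)
    client.messages.create(body=body, from_=from_number, to=to_number)


def notify_if_risky(risk_score: float, threshold: float, to_number: str,
                    send: Callable[[str, str], None]) -> bool:
    """Call send(to_number, message) if the threshold is reached; report whether it was."""
    message = build_alert(risk_score, threshold)
    if message is None:
        return False
    send(to_number, message)
    return True


# Demonstration with a fake sender that records messages instead of texting.
sent = []
alerted = notify_if_risky(0.82, 0.70, "+15550100",
                          lambda to, body: sent.append((to, body)))
quiet = notify_if_risky(0.30, 0.70, "+15550100",
                        lambda to, body: sent.append((to, body)))
print(alerted, quiet, sent)
```

Passing the sender in as a callable keeps the per-company threshold check testable on its own; in production the callable would wrap `send_sms_via_twilio` with the company's own Twilio credentials.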

Third, we configured the driver’s phone so that text messages received from Twilio are automatically read out loud by Google Assistant, notifying the driver that they have entered a high-risk state and telling them how to adjust their behavior.

Finally, the risk profiles of all drivers are displayed graphically on a dashboard powered by Plotly Dash. This allows management to monitor driver safety in ‘real time’ and take preventative action before an incident occurs.
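A dashboard of this kind could be laid out roughly as below. This is a minimal sketch, not the BSafe dashboard: the driver names, scores, chart type and component ids are hypothetical. The figure is built as a plain dictionary, which `dcc.Graph` accepts, so the data-shaping part runs without Dash installed; Dash itself is imported only inside `create_app`.

```python
def build_risk_figure(risk_by_driver: dict) -> dict:
    """Build a Plotly bar figure (as a plain dict), highest-risk driver first."""
    ranked = sorted(risk_by_driver.items(), key=lambda item: item[1], reverse=True)
    return {
        "data": [{
            "type": "bar",
            "x": [name for name, _ in ranked],
            "y": [score for _, score in ranked],
        }],
        "layout": {
            "title": {"text": "Current risk score by driver"},
            "yaxis": {"range": [0, 1], "title": {"text": "Risk score"}},
        },
    }


def create_app(risk_by_driver: dict):
    """Wrap the figure in a one-page Dash app (requires Dash 2.0 or later)."""
    from dash import Dash, dcc, html  # lazy import: only needed to serve the page

    app = Dash(__name__)
    app.layout = html.Div([
        html.H1("BSafe driver risk"),
        dcc.Graph(id="driver-risk", figure=build_risk_figure(risk_by_driver)),
    ])
    return app


# Hypothetical scores for three drivers.
example_scores = {"driver_a": 0.35, "driver_b": 0.82, "driver_c": 0.10}
figure = build_risk_figure(example_scores)
print(figure["data"][0]["x"])
```

Calling `create_app(example_scores).run(debug=True)` would serve the page locally (older Dash releases use `run_server` instead). The figure here is static; refreshing it as new scores arrive would need a callback, for example one driven by a `dcc.Interval` component.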

Challenges we ran into

We initially began using the Firebase platform to display the output from the model because it provided attractive features such as mapping individual stops made by the driver and running a simulation of the truck route over time. However, we ran into significant challenges in trying to adapt open source code from Firebase to our input data. As a result, we decided to use Dash in the interests of time.

Accomplishments that we're proud of

We are very proud that, under a constrained timeline, we were able to build such a complex machine learning model that employs a neural network and draws data from multiple public and private sources.

We’re also proud that we were able to get a market-ready product that could be deployed ‘in the field’ tomorrow, using predictive analytics to change human behavior in ‘real time’ with the aim of saving lives and reducing human-caused environmental incidents, leading to a more sustainable world.

What we learned

We have learned how to put together pieces of data from multiple sources and APIs in a clean and standard way to train advanced models. We have also learned about a great new tool for notifications (Twilio) and noticed the improvements from using Google Cloud VMs to train deep learning models.

What's next for BSafe

In the future, we would like to fully leverage all of Google’s capabilities and APIs (e.g., Firebase) to provide management with more insightful ‘real time’ analytics on safety. Another opportunity is to link into the MassDOT API for historical traffic information and potentially real-time traffic camera images for improved risk analysis.

We would also like to train the model using actual data from a Waste & Recycling company.
