Machine learning is now applied in many fields, including weather forecasting. The observations provided for the competition allowed us to address an important road-safety issue, and we supplemented them with additional weather data sources.
What it does
The final product is a mathematical model that can be used in a mobile or web application to predict the risk of road icing and inform drivers. The workflow is:
- Check the fresh weather report
- Put data into the model
- Predict road friction for the next N hours
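The three steps above can be sketched roughly as follows. This is only an illustration: the `HourlyWeather` fields, the `ToyModel` stand-in, and the 0.3 alert threshold are assumptions made for the example, not the features, model, or threshold used in the project.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HourlyWeather:
    """One hour of a weather report (field names are illustrative assumptions)."""
    air_temp_c: float
    humidity_pct: float
    precipitation_mm: float


class ToyModel:
    """Stand-in for the trained regressor: anything with a predict() method works."""

    def predict(self, rows: Sequence[Sequence[float]]) -> List[float]:
        # Crude rule for demonstration only: freezing and wet means low friction.
        return [0.2 if temp <= 0.0 and precip > 0.0 else 0.8
                for temp, _humidity, precip in rows]


def forecast_friction(model, report: Sequence[HourlyWeather], n_hours: int) -> List[float]:
    """Turn the first n_hours of a report into feature rows and predict friction."""
    rows = [[h.air_temp_c, h.humidity_pct, h.precipitation_mm]
            for h in report[:n_hours]]
    return [float(value) for value in model.predict(rows)]


def icing_alerts(friction: Sequence[float], threshold: float = 0.3) -> List[bool]:
    """Flag hours whose predicted friction falls below a hypothetical threshold."""
    return [value < threshold for value in friction]
```

An application would call `forecast_friction` with the real trained model and the latest report, then show drivers the hours flagged by `icing_alerts`.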
How we built it
- Historical weather data was collected using a public API and preprocessed.
- The central part of the project is a powerful and robust XGBoost regressor.
- We use Bayesian optimization provided by the Optuna library for hyperparameter search.
Challenges we ran into
The most complicated step was data preprocessing and feature selection. From the original data, we decided to use only the Friction observations as the targets of our model, because public and personal vehicles are usually not equipped with the set of sensors used to collect the Smart Road Measurements dataset. We want to make reliable forecasts possible from publicly available data that weather stations provide hourly.
Accomplishments that we're proud of
We learned the difference between time-series forecasting and time-series classification. We also achieved a weighted accuracy of ~76.67%, a strong result given how difficult the data is and how short the observation period was.
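The write-up does not define "weighted accuracy", so the sketch below shows just one common reading: each sample is weighted by the inverse frequency of its true class, which makes the score equal to the average per-class recall. The class labels in the usage note are hypothetical, and the competition's actual metric may differ.

```python
from collections import Counter


def weighted_accuracy(y_true, y_pred):
    """Accuracy with each sample weighted by 1 / (size of its true class)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    counts = Counter(y_true)
    weights = [1.0 / counts[label] for label in y_true]
    # Sum the weights of correctly predicted samples, then normalize.
    correct = sum(w for w, t, p in zip(weights, y_true, y_pred) if t == p)
    return correct / sum(weights)
```

For example, with true labels `["dry", "dry", "dry", "icy"]` and predictions `["dry", "dry", "icy", "icy"]`, plain accuracy is 3/4, while this weighted version gives 5/6 because the single rare "icy" sample counts as much as the three "dry" samples together.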
What we learned
We gained a deep understanding of the mathematical background, tools, and approaches to time-series data. I lost part of the code at an early stage of the project, which gave me a boost of motivation to automate backups and version control. I became wiser. - Andrii
For the first time, I learned about time-series analysis, data preprocessing, dealing with outliers, collecting relevant real-world datasets, and a unique regression model, and that's my achievement. - Abrar
What's next for Road Friction Forecasting
With more observations and data sources, we can take on even more challenging problems.
Pandas Profiling Demo: https://krutsylo.neocities.org/SmartRoads/pandas3.html