Inspiration

We all hate delayed trains, especially when we are on one and need to be somewhere. Most of the time, the delay is put down to the infamous excuse of "Bad Engineering". A significant portion of delays is due to the weathering of rail assets. Damaged assets can cause minor to major delays, which cost both the passengers and the trainline. We want to present a solution to that problem.

What it does

Our solution uses machine learning to perform regression on the dataset provided by Network Rail. It takes that dataset and predicts which assets are most likely to fail, based on both the geographical area of the asset and the corresponding weather conditions. We aim to display the predictions as a heat map in a web app, as browsers are easily accessible to everyone, anywhere. Ultimately, our solution gives engineers a more efficient way to find the areas most at risk from weathering, saving both time and money for the trainline.
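As a rough illustration of how per-asset predictions could be turned into heat map data, here is a minimal sketch in plain Python. The function name `heatmap_cells`, the `(latitude, longitude, risk)` tuple layout, and the half-degree cell size are all assumptions made for this example, not details of our actual app.

```python
import math
from collections import defaultdict


def heatmap_cells(predictions, cell_size_deg=0.5):
    """Average predicted failure risk per latitude/longitude grid cell.

    predictions: iterable of (latitude, longitude, risk) tuples, where risk
    is a hypothetical model score between 0 and 1.
    Returns a list of dicts, highest average risk first.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell index -> [risk sum, count]
    for lat, lon, risk in predictions:
        # Integer cell indices avoid floating-point keys.
        key = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        totals[key][0] += risk
        totals[key][1] += 1

    cells = []
    for (row, col), (risk_sum, count) in totals.items():
        cells.append({
            "lat": row * cell_size_deg,   # south-west corner of the cell
            "lon": col * cell_size_deg,
            "risk": risk_sum / count,
            "assets": count,
        })
    # Hottest cells first, so a front end can colour or truncate the list.
    return sorted(cells, key=lambda cell: cell["risk"], reverse=True)


# Made-up example points: two assets near London, one near Edinburgh.
example = [(51.53, -0.12, 0.8), (51.51, -0.10, 0.6), (55.95, -3.19, 0.3)]
for cell in heatmap_cells(example):
    print(cell)
```

Averaging per cell is only one possible choice; taking the maximum risk in each cell would highlight single critical assets instead.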

How we built it

We built the front end using the React framework by Facebook. The front end will incorporate the heat map, which displays the more critical regions of the UK where assets are more susceptible to weathering, taking into account their position and previous weather forecasts from weather stations within that area.

We were given a large dataset to infer from. The dataset needed to be cleaned, so we broke out Jupyter Notebook and began to analyse and process the data using the Pandas library in Python. Using Pandas, we were able to remove redundant data that would otherwise have hurt the performance of the machine learning model. Furthermore, we merged several datasets together to recover certain features of the assets, such as their coordinates on the map.

We also used NLP, extracting the descriptive text recorded about incidents. We then passed that text through a Naive Bayes classifier, which predicts which assets will be prone to failure, along with their coordinates. Our first attempt reached 55% accuracy; after further cleaning of the data, we reached 64%.
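To make the classification step described above more concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier written in plain Python so that it runs without extra libraries. It is not our actual model: the class `NaiveBayesText`, the labels `"weather_failure"` and `"other"`, and the sample incident descriptions are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesText:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # label -> number of documents
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def log_scores(self, text):
        """Return log P(label) + sum of log P(word | label) for each label."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word in words:
                count = self.word_counts[label][word]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented incident descriptions, standing in for the real free-text field.
texts = [
    "points failure after heavy snow and ice",
    "track circuit failure following flooding",
    "signal cable damaged during heatwave",
    "routine inspection completed",
    "scheduled maintenance completed",
    "vegetation cleared near platform",
]
labels = ["weather_failure"] * 3 + ["other"] * 3

model = NaiveBayesText().fit(texts, labels)
print(model.predict("points frozen by ice and snow"))
print(model.predict("routine maintenance completed"))
```

In practice a library implementation would be the more sensible choice; the hand-written version is here only to show what the classifier computes from word counts.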

Challenges we ran into

The main challenge we ran into was processing the large dataset in such a short amount of time. We overcame this by dividing the dataset between team members, then reconvening to discuss how the data could be linked and which features could be passed into the model.

Accomplishments that we're proud of

We're proud of working together to solve a problem that will improve the lives of millions of people who use trains by reducing delays and possible hazards. We're also proud of how we used machine learning, a relatively new concept, to solve such a task.

What we learned

We learned about the different assets in railways and how railways work, as well as how even minor weather conditions can have a great impact on railway assets. We also learned to think outside the box when it comes to data preprocessing, and how to link data and synthesise new metrics to glue different datasets together.

What's next for Bring Andy Murray Home

To be continued...
