After learning that Network Rail spends around £50,000 on every correlation survey for signalling projects alone, we realised that the amount spent on asset tracking (the network has over 15,000 building assets alone), track and train positioning, and surveying is phenomenal, not to mention the risk the workforce faces while manually carrying out these dangerous tasks on the network. We thought it would be possible to automate these tasks, improving speed, efficiency and safety.
We realised that by using neural networks alongside computer vision and modelling technology, we could automatically tag assets such as signals, signage and buildings, while using the extra information from the images to improve the location estimate and smooth the GPS, all at once!
What it does
The GPS smoothing was done by building a three-dimensional scene (a point cloud) from the forward-facing video footage. With an actual model of the scene we can better understand the true motion of the train and, from that, calculate its expected position. The raw GPS data is then compared with this expected position and corrected as needed.
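At its simplest, the comparison step can be sketched as an outlier gate: wherever a raw GPS fix disagrees with the video-derived expected position by more than some threshold, the fix is replaced by the motion-model position. This is only an illustrative sketch; the threshold and the metre-based coordinate convention are assumptions, not our tuned values:

```python
import numpy as np

def correct_gps(raw_gps, expected, max_residual_m=20.0):
    """Replace raw GPS fixes that disagree with the expected
    (video-derived) position by more than max_residual_m metres.

    raw_gps, expected: (N, 2) arrays of positions in metres
    (e.g. easting/northing). Returns the corrected track.
    """
    raw_gps = np.asarray(raw_gps, dtype=float)
    expected = np.asarray(expected, dtype=float)
    # distance between each raw fix and the motion-model position
    residual = np.linalg.norm(raw_gps - expected, axis=1)
    corrected = raw_gps.copy()
    bad = residual > max_residual_m
    corrected[bad] = expected[bad]  # snap outliers to the expected position
    return corrected
```

In practice the correction blends the two sources probabilistically rather than hard-switching, but the residual check above is the core idea.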
The asset detection was done by training a Convolutional Neural Network to perform object detection. Due to issues with the provided data, we had to annotate the data ourselves, and due to time restrictions we only annotated train speed-limit signs, as we are most familiar with them. The trained model can automatically detect speed limits. More annotated data would allow us to detect a plethora of assets: signs, signals, generic assets, buildings, workers, etc.
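Modern detectors regress bounding boxes directly, but the underlying idea can be illustrated with a sliding-window sketch, where a stand-in scoring function plays the role of the trained CNN. The window size, stride and threshold below are illustrative assumptions, not our model's settings:

```python
import numpy as np

def detect(image, classifier, win=64, stride=32, threshold=0.9):
    """Slide a window over the image and keep crops the classifier
    scores above the threshold. `classifier` stands in for the
    trained CNN: it maps a (win, win) crop to a confidence in [0, 1].
    Returns a list of (x, y, score) detections (top-left corners).
    """
    h, w = image.shape[:2]
    detections = []
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            score = classifier(image[y:y + win, x:x + win])
            if score >= threshold:
                detections.append((x, y, float(score)))
    return detections
```

A real pipeline would add non-maximum suppression to merge overlapping windows, and the CNN would be trained on the annotated speed-limit crops.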
The end product will be an automatic asset-management system which, with the corrected GPS data, can accurately and precisely detect assets and record their locations on the network.
How we built it
Fundamentally, it’s about creating an always up-to-date, virtual (and cheap) representation of the UK’s rail network, including the physical location, structure and classification of rail network assets.
Practically, this isn’t possible without accurate GPS tracking of trains and spatial mapping tools. To tackle this, we developed a system that analyses the train’s video stream and generates a 3D point cloud of the immediate environment. From this point cloud we can improve our position estimate and ultimately produce a more accurate GPS signal.
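As a concrete sketch of the point-cloud step: each pair of video frames acts like a stereo pair, so once the relative camera poses are estimated, matched pixels can be triangulated into 3D points. Below is a minimal linear (DLT) triangulation of a single point; the camera intrinsics and baseline are made-up values for illustration:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2: 3x4 camera projection matrices.
    x1, x2: (u, v) pixel coordinates of the same scene point
    in each frame. Returns the 3D point in world coordinates.
    """
    # Each view contributes two linear constraints on the
    # homogeneous point X: x * (P[2] @ X) = P[0] @ X, etc.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]               # null-space vector minimises ||A @ X||
    return X[:3] / X[3]      # dehomogenise
```

Repeating this for every matched feature across consecutive frames yields the point cloud of the trackside environment.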
In addition to this work, we developed a machine learning pipeline that can recognise rail network assets (such as speed signs) in images. With our 3D environment model and accurate GPS we can then plot 3D annotations of rail network assets in a virtual representation of the rail network.
In principle, this can be updated every time a train passes through the network.
Challenges we ran into
Early in the hackathon we realised that we had very limited training data. Therefore, in order to train the convolutional neural network we used to classify lineside assets, we first had to use more traditional machine-vision techniques to generate training data for it. It was also a great challenge to take on more than one of the problems posed by Network Rail; although this was a lot of work, it allowed us to distribute the workload better as a team.
Accomplishments that we're proud of
Finding a solution to the GPS smoothing! The GPS positioning was completely inaccurate for long stretches; using the video, it was corrected amazingly well, sticking to each individual track. Moreover, we're pretty proud that the trained CNN model can detect train signs better than we can!
What we learned
In solving both of the Network Rail challenges we learnt a huge amount about advanced data-processing techniques for both geolocation and image processing. To smooth the location of the train we applied Kalman filters, representing the train's location as a probabilistic model. We also learnt about the kinematics of train motion and the physical limits on rail curvature and train acceleration; all of this information was used to infer the train's true location. At the intersection of video processing and geolocation, we learnt about sensor fusion in order to generate a point cloud from the 2D video data. For image detection, we learnt both standard, logic-based image-processing techniques for fitting and convolutional neural networks for processing more complex data.
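The heart of the smoothing is the standard Kalman predict/update cycle. A minimal one-dimensional, constant-velocity sketch follows; the process and measurement noise values here are illustrative assumptions, not our tuned parameters:

```python
import numpy as np

def kalman_smooth(zs, dt=1.0, q=0.01, r=25.0):
    """1D constant-velocity Kalman filter over noisy position fixes.
    zs: sequence of position measurements (e.g. GPS along-track).
    q: process noise intensity (trust in the motion model).
    r: measurement variance (GPS noise). Returns filtered positions.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])      # state transition
    H = np.array([[1.0, 0.0]])                 # we observe position only
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])        # process noise covariance
    R = np.array([[r]])
    x = np.array([[zs[0]], [0.0]])             # initial [position, velocity]
    P = np.eye(2) * 100.0                      # initial uncertainty
    out = []
    for z in zs:
        # predict: roll the state forward with the motion model
        x = F @ x
        P = F @ P @ F.T + Q
        # update: correct with the new measurement
        y = np.array([[z]]) - H @ x            # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
        out.append(float(x[0, 0]))
    return out
```

Our actual system fuses more inputs (the video-derived motion and the kinematic constraints above) into the same predict/update structure, but this captures the probabilistic model the filter maintains.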
What's next for Neural Railworks
The solution still needs to be deployed, and with further sensory input it could provide an early-warning system for drivers, alerting them to speed limits, missed signals, people on the track and other hazards. Eventually the system could be used to:
- Automate trains, working with an onboard signalling system
- Completely replace manual trackside surveying
- Provide a real-time early-warning system
- Log assets immediately: as soon as a sign or signal is erected, the log is updated when the next train passes
The solution could be improved by using:
- Better cameras
- Lower latency in exposure adjustments
- Cameras positioned to pick up more of the track
- Depth sensitive cameras
- Dual cameras
- More parameters provided to Kalman Filter (sensor fusion) to improve smoothing