Title:
Mapping vehicle loads to a bridge using object detection based on YOLOv4
Our Team:
Yijia Zhang, Sijie Ding, Canjie Sun, Teresa Amor
Introduction
Bridges and skyways are common components of our transportation system, and keeping them in sound structural condition is critical for safety. Structural health monitoring (SHM) is a system for measuring and tracking the health of bridges. Although SHM has been developed for decades, it remains hard to deploy widely because the equipment is expensive. We introduce a more approachable method that estimates the live loads on a bridge by recognizing the number and categories of vehicles on it. To estimate the live loads of a bridge or skyway, we set up two traffic cameras on the two sides of Wang Village Bridge to collect raw video data. We label the video data to create our own training/development/testing dataset, then use computer vision techniques based on the YOLOv4 model to identify the categories of vehicles.
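The idea of turning vehicle counts and categories into a live-load estimate can be sketched as follows. This is a minimal illustration, not our final method: the category names and the nominal weights in `NOMINAL_WEIGHT_T` are hypothetical placeholders, and `estimate_live_load` is a helper name invented for this example.

```python
from collections import Counter

# Hypothetical nominal gross weights in tonnes per vehicle category.
# These values are placeholders for illustration, not measured data.
NOMINAL_WEIGHT_T = {
    "car": 1.5,
    "bus": 12.0,
    "truck_2_axle": 15.0,
    "truck_3_axle": 25.0,
}


def estimate_live_load(detected_labels, weights=NOMINAL_WEIGHT_T):
    """Sum the nominal weights of the vehicles detected on the bridge in one frame."""
    counts = Counter(detected_labels)
    unknown = set(counts) - set(weights)
    if unknown:
        raise KeyError(f"no nominal weight for: {sorted(unknown)}")
    return sum(weights[label] * n for label, n in counts.items())


if __name__ == "__main__":
    # Labels a detector might report for a single frame (made-up example).
    frame_labels = ["car", "truck_3_axle", "car", "truck_2_axle"]
    print(f"estimated live load: {estimate_live_load(frame_labels):.1f} t")
```

A lookup table keeps the detector and the load model separate, so the weights can be refined later without retraining.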
Related Work
https://github.com/haroonshakeel/tensorflow-yolov4-tflite
YOLOv4 paper: https://arxiv.org/pdf/2004.10934.pdf
Similar project using YOLOv3 and different data: https://www.sciencedirect.com/science/article/pii/S0968090X1930957X
Data
We are using data collected from Hangzhou Ruhr Technology Ltd. This company builds platforms that assess risk and provide solutions in fields such as emergency management, natural resources, transportation, housing construction, water conservancy, energy, culture and tourism, and ancient architecture. The raw data consists of video footage of different kinds of vehicles traveling on bridges and highways. We set up two cameras on the two sides of Wang Village Bridge to gather the raw data. The bridge is close to a construction site and many trucks cross it, which is why we want to measure the load on the bridge. The videos take up 1 GB per hour and will require some preprocessing. Some of the data has already been labeled. For the rest, we will need to remove frames that do not contain vehicles and label the remaining frames with vehicle types, based on the number of axles and the number of wheels. The numbers of axles and wheels are relevant for determining the approximate weight and weight distribution of each vehicle.
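The preprocessing described above (dropping frames without vehicles and writing labels) could look roughly like the sketch below. It assumes the Darknet-style label layout, one line per object of the form `class x_center y_center width height` with values normalized by the image size. Other training tools may expect a different layout. The function names, frame names, and class id are hypothetical.

```python
def to_yolo_line(class_id, box, image_w, image_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a Darknet-style label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_w
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def frames_with_vehicles(annotations):
    """Keep only the frames that have at least one annotated vehicle."""
    return {name: boxes for name, boxes in annotations.items() if boxes}


if __name__ == "__main__":
    # Hypothetical annotations: frame name -> list of (class_id, pixel box).
    annotations = {
        "frame_0001.jpg": [(2, (100, 200, 300, 400))],
        "frame_0002.jpg": [],  # empty road, dropped below
    }
    for name, boxes in frames_with_vehicles(annotations).items():
        for class_id, box in boxes:
            print(name, to_yolo_line(class_id, box, image_w=1000, image_h=800))
```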
Methodology
We will use transfer learning with YOLOv4 to train our model. We chose YOLOv4 because it is known to perform well at real-time object detection. By adding our own layer to the YOLOv4 architecture, we hope to specialize its real-time detection for our purpose. If we run into problems, we could try transfer learning with other models instead of YOLOv4. If we have extra time, we could also experiment with different models and compare their performance.
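One concrete step when retraining a pretrained YOLO detector on new classes is resizing the output of each detection head to the new class count. The sketch below shows only that arithmetic, assuming the usual three anchors per detection scale. The function name and the class list are hypothetical, and the rest of the training setup is not covered here.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Output channels of the convolution feeding one YOLO detection head.

    Each anchor predicts 4 box coordinates, 1 objectness score,
    and one score per class.
    """
    return (num_classes + 5) * anchors_per_scale


if __name__ == "__main__":
    # Hypothetical vehicle classes for the bridge dataset.
    vehicle_classes = ["car", "bus", "truck_2_axle", "truck_3_axle"]
    print("filters for COCO (80 classes):", yolo_head_filters(80))
    print("filters for our classes:", yolo_head_filters(len(vehicle_classes)))
```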
Metrics
We plan to test our model by running it on unseen testing data. We are interested in both the accuracy of the model's predictions and the speed at which it makes them, since a fast model is important for real-time detection. Here, accuracy means how well the model predicts the correct label for the vehicles in the images and videos. For speed, we want the model to classify each vehicle in real time, before the vehicle leaves the video frame.
Our base goal is to process our data and run YOLOv4 on our new dataset. Our target goal is to apply transfer learning to YOLOv4 to create a more accurate model for the type of data we are classifying. Our stretch goal is to experiment with and compare different model architectures, and possibly to recognize individual vehicles by their license plates rather than only their type.
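The two metrics could be measured along the lines of the sketch below. It is a simplified illustration under stated assumptions: a prediction counts as correct when it has the same label as a ground-truth vehicle and overlaps it with an intersection over union (IoU) of at least 0.5, and one prediction may match more than one ground-truth box. The function names, the threshold, and the dummy detector are all hypothetical.

```python
import time


def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_accuracy(truth, predictions, iou_threshold=0.5):
    """Fraction of ground-truth vehicles matched by a same-label prediction.

    Both arguments are lists of (label, box) pairs for one frame.
    """
    if not truth:
        return 0.0
    correct = 0
    for t_label, t_box in truth:
        if any(p_label == t_label and iou(t_box, p_box) >= iou_threshold
               for p_label, p_box in predictions):
            correct += 1
    return correct / len(truth)


def frames_per_second(detect, frames):
    """Average number of frames the detector processes per second."""
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    truth = [("truck", (0, 0, 10, 10)), ("car", (20, 20, 30, 30))]
    predictions = [("truck", (1, 1, 10, 10)), ("bus", (20, 20, 30, 30))]
    print("label accuracy:", label_accuracy(truth, predictions))
    # A dummy detector stands in for the real model here.
    print("fps:", frames_per_second(lambda frame: None, list(range(100))))
```

Comparing the measured frames per second with the camera frame rate gives a rough check on whether predictions arrive before a vehicle leaves the frame.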
Ethics
Why is Deep Learning a good approach to this problem?
The equipment for an SHM system is expensive, so it is infeasible to apply SHM to every small or medium-scale bridge like Wang Village Bridge. If we instead apply computer vision techniques to recognize vehicles and estimate the load on the bridge, we only need to set up 2 or 4 cameras on the two sides of the bridge.
What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?
We collect raw video data from two cameras set up on Wang Village Bridge. The data quality is sometimes poor, for example in rain or in the late afternoon, when the vehicles appear too bright. Because the bridge is close to a construction site, the vehicles are mainly trucks and other vehicle types are scarce. As a result, the model we build is probably only suitable for this scenario and will not generalize well, mainly because the data lacks diversity.
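The imbalance between trucks and other vehicle types can be checked directly on the labels before training. The sketch below is only an illustration: the class names, the label counts, and the 10% threshold are hypothetical, and `underrepresented` is a helper name made up for this example.

```python
from collections import Counter


def underrepresented(labels, classes, min_share=0.10):
    """Return the classes whose share of all labels is below min_share.

    Classes that never appear in the labels are reported as well.
    """
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return sorted(classes)
    return sorted(c for c in classes if counts[c] / total < min_share)


if __name__ == "__main__":
    # Made-up label counts that mimic a truck-heavy dataset.
    labels = ["truck"] * 18 + ["car"] + ["bus"]
    classes = ["truck", "car", "bus", "motorcycle"]
    print("underrepresented classes:", underrepresented(labels, classes))
```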
Division of labor
- Everyone works on preprocessing the data (filtering images and labeling them).
- Everyone works on YOLOv4 transfer learning and on designing and experimenting with our own layers.
- Everyone individually applies transfer learning to another model and compares its performance with YOLOv4.
Continue updating...

