Title:

Mapping Vehicle Loads to a Bridge Using YOLOv4-Based Object Detection

Our Team:

Yijia Zhang, Sijie Ding, Canjie Sun, Teresa Amor

Introduction

Bridges and skyways are common components of our transportation system, and keeping them in sound structural condition is critical for safety. Structural health monitoring (SHM) systems measure and track the health of bridges. Although SHM has been developed for decades, the expensive equipment it requires has kept it from being widely adopted. We introduce a more approachable method that estimates the live load on a bridge by recognizing the number and categories of vehicles on it. To do this, we set up two traffic cameras on both sides of Wang Village Bridge to collect raw video data. We label the video to create our own training/development/test dataset, and then use computer vision techniques based on the YOLOv4 model to identify the vehicle categories.
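The idea of turning vehicle counts and categories into a live-load estimate can be sketched as follows. This is only an illustration under stated assumptions: the class names and the nominal weights in `NOMINAL_WEIGHT_T` are hypothetical placeholders, not values from our dataset, and a real estimate would also need the weight distribution across axles.

```python
# Hypothetical nominal gross weights in tonnes per vehicle class.
# These numbers are placeholders for illustration, not measured values.
NOMINAL_WEIGHT_T = {
    "car": 1.5,
    "truck_2_axle": 12.0,
    "truck_3_axle": 25.0,
}


def estimate_live_load(detected_classes, weights=NOMINAL_WEIGHT_T):
    """Sum the nominal weights of all vehicles currently detected on the bridge."""
    unknown = sorted({c for c in detected_classes if c not in weights})
    if unknown:
        raise KeyError(f"no nominal weight for classes: {unknown}")
    return sum(weights[c] for c in detected_classes)


if __name__ == "__main__":
    # Classes the detector reported for one video frame (made-up example).
    on_bridge = ["car", "truck_3_axle", "truck_3_axle"]
    print(f"estimated live load: {estimate_live_load(on_bridge)} t")
```

Raising an error for an unknown class, rather than silently counting it as zero, keeps a missing table entry from producing an underestimated load.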

Related Work

TensorFlow implementation of YOLOv4: https://github.com/haroonshakeel/tensorflow-yolov4-tflite

YOLOv4 paper: https://arxiv.org/pdf/2004.10934.pdf

Similar project using YOLOv3 and different data: https://www.sciencedirect.com/science/article/pii/S0968090X1930957X

Data

We are using data collected from Hangzhou Ruhr Technology Ltd., a company that builds platforms for risk assessment and provides solutions in fields including emergency response, natural resources, transportation, housing construction, water conservancy, energy, culture and tourism, and ancient architecture. The raw data consists of video footage of different kinds of vehicles traveling on bridges and highways. We set up two cameras on both sides of Wang Village Bridge to gather it. The bridge is close to a construction site, so many trucks cross it, which is why we want to measure the load on the bridge. The videos take about 1 GB per hour and will need preprocessing. Some of the data has already been labeled; for the rest, we need to remove frames that contain no vehicles and label the remaining frames with vehicle types (based on the number of axles and wheels). The number of axles and wheels matters because it determines the approximate weight and weight distribution of a vehicle.
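The two preprocessing steps described above, dropping frames without vehicles and writing a label per vehicle, might look like the sketch below. It assumes the Darknet-style text format commonly used with YOLO models (`class x_center y_center width height`, all normalized to the image size); the training code we end up using may expect a different layout. The class ids and frame names are hypothetical.

```python
# Hypothetical class ids, keyed by axle count as described in the text.
CLASS_IDS = {"car": 0, "truck_2_axle": 1, "truck_3_axle": 2}


def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError(f"box {box} does not fit a {img_w}x{img_h} image")
    x_c = (x_min + x_max) / 2 / img_w  # normalized box centre
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w        # normalized box size
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


def keep_frames_with_vehicles(boxes_by_frame):
    """Drop frames whose annotation list is empty."""
    return {name: boxes for name, boxes in boxes_by_frame.items() if boxes}


if __name__ == "__main__":
    frames = {
        "frame_0001.jpg": [("truck_3_axle", (480, 270, 1440, 810))],
        "frame_0002.jpg": [],  # no vehicles, will be removed
    }
    for name, boxes in keep_frames_with_vehicles(frames).items():
        for cls, box in boxes:
            print(name, to_yolo_line(CLASS_IDS[cls], box, 1920, 1080))
```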

Methodology

We will use transfer learning with YOLOv4 to train our model. We chose YOLOv4 because it is known to perform well at real-time object detection; by adding our own layer to the YOLOv4 architecture, we hope to specialize its real-time detection for our purpose. If we run into problems, we could try transfer learning with other models instead. If we have extra time, we could also experiment with different models and compare their performance.
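One practical consequence of reusing a pretrained YOLOv4 is that its prediction layers are sized for the original class list, so they have to be replaced when the classes change. At each of its three detection scales, YOLOv4 predicts, per anchor, four box coordinates, one objectness score, and one score per class. The sketch below computes the resulting output shapes; the 416-pixel input, three anchors per scale, and strides of 8, 16, and 32 are common defaults, and the count of three vehicle classes is a hypothetical example.

```python
def yolo_head_shapes(num_classes, input_size=416, anchors_per_scale=3,
                     strides=(8, 16, 32)):
    """Return the (grid, grid, channels) output shape of each detection scale."""
    if input_size % max(strides) != 0:
        raise ValueError("input_size must be a multiple of the largest stride")
    # Per anchor: 4 box coordinates + 1 objectness score + one score per class.
    channels = anchors_per_scale * (5 + num_classes)
    return [(input_size // s, input_size // s, channels) for s in strides]


if __name__ == "__main__":
    print("80 COCO classes:  ", yolo_head_shapes(80))
    print("3 vehicle classes:", yolo_head_shapes(3))
```

With this layout, the backbone weights can be kept as they are while only the final prediction layers are re-initialized for the new number of channels.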

Metrics

We plan to test our model on unseen test data. We are interested in both the accuracy of the model's predictions and the speed at which it makes them, since a fast model is important for real-time detection. Here, accuracy means how well the model predicts the correct label for the vehicles in the images and videos. For speed, we want the model to classify each vehicle in real time, before the vehicle leaves the video frame.
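A minimal sketch of how both criteria could be checked is given below. It assumes a detection counts as correct when its class matches a ground-truth box and the two boxes overlap with an intersection over union (IoU) of at least 0.5, a common convention rather than something we have fixed. The speed check uses a stricter proxy than "before the vehicle leaves the frame": it asks whether inference keeps up with the camera frame rate. All function names and example numbers are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_correct(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match predictions to ground truth; each truth box is used once."""
    unmatched = list(ground_truth)
    correct = 0
    for p_cls, p_box in predictions:
        best, best_iou = None, iou_threshold
        for i, (g_cls, g_box) in enumerate(unmatched):
            score = iou(p_box, g_box)
            if g_cls == p_cls and score >= best_iou:
                best, best_iou = i, score
        if best is not None:
            unmatched.pop(best)
            correct += 1
    return correct


def keeps_up(seconds_per_frame, camera_fps):
    """True if one inference finishes before the next camera frame arrives."""
    return seconds_per_frame <= 1.0 / camera_fps


if __name__ == "__main__":
    preds = [("truck", (0, 0, 10, 10)), ("car", (20, 20, 30, 30))]
    truth = [("truck", (1, 1, 10, 10)), ("truck", (20, 20, 30, 30))]
    n = count_correct(preds, truth)
    print("precision:", n / len(preds), "recall:", n / len(truth))
    print("keeps up at 25 fps:", keeps_up(0.03, 25))
```

Dividing the number of correct detections by the number of predictions gives precision, and dividing by the number of ground-truth boxes gives recall.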

Our base goal is to process our data and run YOLOv4 on our new dataset. Our target goal is to apply transfer learning to YOLOv4 to create a more accurate model for the type of data we are classifying. Our stretch goal is to experiment with and compare different model architectures, and possibly to recognize individual vehicles by their license plates rather than only the vehicle type.

Ethics

Why is Deep Learning a good approach to this problem?

SHM equipment is expensive, so it is infeasible to apply SHM to every small or medium-scale bridge like Wang Village Bridge. If we instead use computer vision to recognize vehicles and estimate the load on the bridge, we only need to set up 2 or 4 cameras on both sides of the bridge.

What is your dataset? Are there any concerns about how it was collected, or labeled? Is it representative? What kind of underlying historical or societal biases might it contain?

We collect raw video data from two cameras set up on Wang Village Bridge. The data quality is sometimes poor, for example in rain or in the late afternoon, when vehicles appear too bright. Because the bridge is close to a construction site, the vehicles are mainly trucks and other vehicle types are underrepresented. The model we build will therefore probably suit only this scenario and will not generalize well, mainly because the data lacks diversity.

Division of labor

  • Everyone works on preprocessing the data (filtering images and labeling them).
  • Everyone works on YOLOv4 transfer learning and on designing and experimenting with our own layers.
  • Everyone individually applies transfer learning to another model and compares its performance with YOLOv4.

Continue updating...

Final Writeup

writeup-link

Updates

Check-in #2 Introduction (from proposal): Bridges and skyways are common components of our transportation system, and keeping them in sound structural condition is critical for safety. Structural health monitoring (SHM) systems measure and track the health of bridges. Although SHM has been developed for decades, the expensive equipment it requires has kept it from being widely adopted. We introduce a more approachable method that estimates the live load on a bridge by recognizing the number and categories of vehicles on it. To do this, we set up two traffic cameras on both sides of Wang Village Bridge to collect raw video data. We label the video to create our own training/development/test dataset, and then use computer vision techniques based on the YOLOv4 model to identify the vehicle categories.

Challenges: So far we have had some difficulty agreeing on how to work on the project together, especially since we are in different time zones. This has made coordinating our efforts harder, but it should get easier as we have more time to dedicate to the project.

Insights: At this point in time we do not have concrete results to show.

Plan: We need to dedicate more time to setting up the model architecture so that we can run it and get results. Going forward we will be able to devote more time to the project, which will let us set up and tune the architecture.
