Traffic Sign CNN Classifier

Sebastian Martino posted an update — Dec 09, 2021 05:30 PM EST

Title:

Traffic Sign CNN Classifier

Who:

Sebastian Martino

Introduction:

This project tackles the classification of road signs using convolutional neural networks (CNNs), a task with direct applications in autonomous vehicles. The research paper I am reimplementing can be found here. That paper addresses the simultaneous detection and classification of traffic signs; this project limits its scope to classification only.

Methodology:

My final implementation follows the modified LeNet architecture described in this blogpost (with some minor tweaks of my own), which consists of:

1st Convolution layer, 32 filters, kernel size of 1x1, relu activation

2nd Convolution layer, 32 filters, kernel size of 5x5, relu activation

Max pooling, pool size of 2x2

3rd Convolution layer, 32 filters, kernel size of 5x5, relu activation

Max pooling, pool size of 2x2

Flatten layer

1st Fully connected layer, output size of hidden_dim1 (1024), relu activation

Dropout layer, dropout rate of 0.6

2nd Fully connected layer, output size of hidden_dim2 (512), relu activation

Dropout layer, dropout rate of 0.6

3rd and final Fully connected layer, output size of num_classes (43), softmax activation
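To make the layer sizes concrete, the stack above can be traced by hand. The sketch below assumes a 32x32x3 input and stride-1 convolutions without padding, neither of which is stated in this writeup; the helper names (conv, pool, dense, trace_model) are hypothetical and exist only for this illustration.

```python
# Assumptions (not stated in the writeup): 32x32 RGB input, stride-1
# convolutions without padding, and non-overlapping 2x2 max pooling.
INPUT_SHAPE = (32, 32, 3)
HIDDEN_DIM1 = 1024
HIDDEN_DIM2 = 512
NUM_CLASSES = 43


def conv(shape, filters, kernel):
    """Return (output shape, parameter count) for an unpadded stride-1 conv."""
    h, w, c = shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out_shape, params


def pool(shape, size):
    """Return the output shape of non-overlapping max pooling."""
    h, w, c = shape
    return (h // size, w // size, c)


def dense(n_in, n_out):
    """Return the parameter count of a fully connected layer."""
    return n_in * n_out + n_out  # weights + biases


def trace_model(input_shape=INPUT_SHAPE):
    """Walk the layer list and return (flattened size, total parameters)."""
    total = 0

    shape, params = conv(input_shape, 32, 1)   # 1st convolution, 1x1
    total += params
    shape, params = conv(shape, 32, 5)         # 2nd convolution, 5x5
    total += params
    shape = pool(shape, 2)                     # max pooling, 2x2
    shape, params = conv(shape, 32, 5)         # 3rd convolution, 5x5
    total += params
    shape = pool(shape, 2)                     # max pooling, 2x2

    flat = shape[0] * shape[1] * shape[2]      # flatten layer
    total += dense(flat, HIDDEN_DIM1)          # 1st fully connected layer
    total += dense(HIDDEN_DIM1, HIDDEN_DIM2)   # 2nd fully connected layer
    total += dense(HIDDEN_DIM2, NUM_CLASSES)   # final softmax layer
    return flat, total


if __name__ == "__main__":
    flat, total = trace_model()
    print(f"flattened features: {flat}, trainable parameters: {total}")
```

Under these assumptions the flatten layer produces 800 features and the network has about 1.42 million trainable parameters, more than half of them in the first fully connected layer. Dropout layers add no parameters, so they are omitted from the trace.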

Results:

My model achieved ~95% accuracy on the test data after 10 epochs. For comparison, the research paper notes that others have achieved over 99% accuracy on the same German Traffic Sign dataset.

Challenges:

The only major roadblocks I encountered were related to finding and working with road sign datasets. Initially I wanted to use the dataset from the original research paper, the Chinese Traffic Sign Dataset, but found it too cumbersome to preprocess. The research paper notes that similar projects using the German Traffic Sign Recognition Benchmark dataset achieved very high accuracy in both detection and classification (nearly 100%), and I also found other papers and blogposts using this dataset, so after my second checkpoint I decided to pivot to it. As mentioned in my second reflection, I had planned to eventually find a U.S. road sign dataset, but I had no luck finding a free one that wouldn't require significantly more preprocessing and manual labeling.

Reflection:

I believe my model met and exceeded my initial expectations. I think I was too conservative with my base, target, and stretch goals (65%, 70%, and 85% respectively), as I based them on the ~88% accuracy that the research paper achieved with a different, more complex dataset while also performing simultaneous detection and classification. The research paper mentions projects using the German dataset I eventually transitioned to that achieved more than 99% accuracy.

I was initially concerned that my model was overfitting given such high accuracy. However, I took steps to avoid this by adding multiple dropout layers and reducing the learning rate, and I saw that training and testing accuracy tracked each other closely (i.e. there was no combination of high training accuracy and low testing accuracy; both grew at similar rates with each epoch). This, along with the fact that others have achieved nearly 100% accuracy on the same data, made me think that the 95% accuracy I achieved was not unreasonable.

My approach did not change significantly over time, aside from the previously mentioned pivot in dataset selection. One change I did make was to follow a different architecture from the one described in the original research paper, instead following the modified LeNet architecture described in this blogpost. Since others using this dataset achieved near-perfect classification accuracy, with more time I would modify my model further to bring its accuracy closer to 99%; however, I am more than happy with the 95% accuracy of my implementation.

Working on this project gave me a deeper understanding of CNNs and their practical applications in both image identification and recognition. I also discovered and learned a number of very useful TensorFlow APIs that greatly streamlined my implementation, namely Sequential, compile, and fit, as well as TensorBoard for visualizing loss and accuracy.
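As a rough illustration of how those APIs fit together, here is a minimal sketch that builds the layers listed under Methodology with Sequential, then calls compile and fit with a TensorBoard callback. It is not the project's actual code: the 32x32x3 input size, the Adam optimizer, the learning rate value, the batch size, integer class labels, and the function names (build_model, train, accuracy_gaps) are all assumptions made for this example.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 43
HIDDEN_DIM1 = 1024
HIDDEN_DIM2 = 512
INPUT_SHAPE = (32, 32, 3)  # assumption: image size is not stated in the writeup


def build_model():
    """Stack the layers listed under Methodology with the Sequential API."""
    layers = tf.keras.layers
    return tf.keras.Sequential([
        tf.keras.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, 1, activation="relu"),
        layers.Conv2D(32, 5, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Conv2D(32, 5, activation="relu"),
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(HIDDEN_DIM1, activation="relu"),
        layers.Dropout(0.6),  # Keras reads this as the fraction of units dropped
        layers.Dense(HIDDEN_DIM2, activation="relu"),
        layers.Dropout(0.6),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


def train(model, x_train, y_train, x_val, y_val, epochs=10, log_dir="logs"):
    """Compile and fit the model, logging loss and accuracy for TensorBoard."""
    model.compile(
        # Optimizer and learning rate are placeholders, not the project's values.
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    tensorboard = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),  # held-out data evaluated each epoch
        epochs=epochs,
        batch_size=64,
        callbacks=[tensorboard],
        verbose=0,
    )
    return history.history


def accuracy_gaps(history):
    """Per-epoch training minus held-out accuracy; a widening gap hints at overfitting."""
    return [t - v for t, v in zip(history["accuracy"], history["val_accuracy"])]


if __name__ == "__main__":
    # Random stand-in data, only to show the calls run end to end.
    rng = np.random.default_rng(0)
    x = rng.random((64, *INPUT_SHAPE)).astype("float32")
    y = rng.integers(0, NUM_CLASSES, size=64)
    hist = train(build_model(), x[:48], y[:48], x[48:], y[48:], epochs=2)
    print(accuracy_gaps(hist))
```

With logging enabled this way, running `tensorboard --logdir logs` displays the per-epoch loss and accuracy curves, and accuracy_gaps gives a quick numeric version of the training-versus-testing comparison described above.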
