REFLECTION
Introduction:
Deep learning has made tremendous progress since the inception of AlexNet, and the field is constantly developing new techniques to improve its models. Regularization is one such technique, used for a variety of reasons, such as preventing extreme weight values or encouraging sparsity in the weights. Although it is widely acknowledged that regularization can improve accuracy and other performance metrics, its effects on explainability have not been explored extensively. This project undertakes a detailed analysis of how different types of regularization affect the explainability of several image classification models.
Challenges:
The largest limitation we have encountered is the sheer amount of computational resources required to properly train models. At first, we had planned on using ImageNet and training two open-source architectures (Resnet50 and VGG19) with multiple regularizers. However, training just one model in this fashion took over a day! That left very little time for debugging, fine-tuning, or a thorough explainability analysis, so we modified our methodology to overcome the computational barrier. Specifically, we elected to use CIFAR10 rather than a larger dataset such as ImageNet or CIFAR100. We also decided to write our own convolutional neural network in PyTorch so that we would have at least one set of models (our CNN under multiple regularization conditions) that could be trained quickly and tested extensively.
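For concreteness, here is a minimal sketch of the kind of small CIFAR10 classifier we mean. The layer sizes and dropout placement below are illustrative, not our exact configuration (which is in the linked notebooks):

```python
import torch.nn as nn

class SimpleCNN(nn.Module):
    """A small CIFAR10 classifier; layer sizes here are illustrative."""
    def __init__(self, dropout_p=0.0):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 16x16 -> 8x8
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p=dropout_p),               # dropout as one regularization condition
            nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
            nn.Linear(128, 10),                    # 10 CIFAR10 classes
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```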
Insights:
We have completed a significant portion of the project already, and are excited to share our results! They can be found at this link: https://github.com/JayRGopal/reg-explain.
The models we have trained, as well as the Colab Notebooks we have written, can be found at this link: https://drive.google.com/drive/u/0/folders/1JDJhca-zm6o2pFi3PdZ1psn39_kc0gc3.
In summary, we have completed the Simple CNN and trained it (with various regularizers) along with multiple off-the-shelf models on CIFAR10. We have made extensive progress on the explainability analysis and have uploaded the most important qualitative findings to GitHub as images. For example, we selected a representative image of each class and computed each model’s input gradients overlaid on those images. These overlays let us show whether regularization helps models “focus” on what humans consider the most important features.
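Concretely, the overlays come from plain input gradients (saliency). A minimal sketch, assuming a trained model and a normalized CIFAR10 image tensor of shape (3, 32, 32):

```python
import torch

def saliency(model, img, target_class):
    """Gradient of the target logit w.r.t. the input pixels."""
    model.eval()
    x = img.unsqueeze(0).requires_grad_(True)    # add a batch dimension
    logits = model(x)
    logits[0, target_class].backward()
    # Max of |gradient| over channels gives a 32x32 heatmap to overlay.
    return x.grad.abs().max(dim=1)[0].squeeze(0)
```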
Our expectations were that certain regularizers would increase explainability while others would not. Notably, we thought that L2, by decreasing the magnitude of the weights, would spread out the model’s gradients, broadening its visual field. We believed that dropout would force the model to learn redundant features, meaning it would attend to multiple important regions. We thought that L1, by encouraging sparsity in the weights, would decrease overall explainability, since some important areas could have their influence reduced to zero.
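For reference, the three regularization conditions differ only in how training is set up. A sketch, reusing the SimpleCNN above; every hyperparameter value here is a placeholder, not one we tuned:

```python
import torch
import torch.nn as nn

model = SimpleCNN(dropout_p=0.5)           # dropout condition (p is a placeholder)
criterion = nn.CrossEntropyLoss()

# L2 condition: weight decay built into the optimizer (coefficient is a placeholder).
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

def l1_penalty(net, lam=1e-5):
    """L1 condition: a sparsity penalty added to the loss by hand (lambda is a placeholder)."""
    return lam * sum(p.abs().sum() for p in net.parameters())

# One illustrative training step on a stand-in batch.
inputs = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(inputs), labels) + l1_penalty(model)
loss.backward()
optimizer.step()
```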
Our findings are very interesting and do not fully line up with these expectations. The Simple CNN’s response to regularization is quite close to what we predicted; for example, L2 regularization seems to dampen the model’s focus on the important areas of images. However, the larger models are extremely difficult to interpret to begin with, and even with various regularizers they remain largely opaque. We look forward to continuing this work so we can draw conclusions about regularization’s impact on models such as Resnet50.
Plan:
At present, we are ahead of schedule. We aim to hit our stretch goal: training and interpreting a total of 9 models (the Simple CNN, Resnet50, and VGG19, each under 3 different regularization conditions). Additionally, because the Simple CNN is easy to train and interpret, we have added multiple new interpretability metrics! For this model, we are incorporating Integrated Gradients as well as DeepLIFT for an even more extensive analysis of regularization’s effects. Due to insufficient memory on Google Colab, we are unable to run these additional metrics on the larger models at this time.
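As a sketch, both attributions can be computed with the Captum library (one convenient option; the model and batch below are stand-ins, reusing the SimpleCNN above):

```python
import torch
from captum.attr import IntegratedGradients, DeepLift

model = SimpleCNN(dropout_p=0.5)   # load trained weights in practice
model.eval()

x = torch.randn(4, 3, 32, 32)      # stand-in batch of CIFAR10-sized images
y = torch.randint(0, 10, (4,))

# Integrated Gradients: averages gradients along a straight-line path
# from a zero baseline to the input, scaled by (input - baseline).
ig_attr = IntegratedGradients(model).attribute(x, target=y, n_steps=50)

# DeepLIFT: propagates contribution scores relative to the baseline
# through the network layer by layer.
dl_attr = DeepLift(model).attribute(x, target=y)
```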
Thus far, we have had to wait for models to train on OSCAR. Even on CIFAR10, large models such as Resnet50 do not train quickly! We would now like to dedicate more time to generating key images for the explainability analysis. The main changes from our original plan were detailed in the Challenges section: to reiterate, we chose a more manageable dataset (CIFAR10) and added a custom model (the Simple CNN) to facilitate fast training.
Citations:
Here are the important public repositories we have utilized thus far:
https://github.com/rwightman/pytorch-image-models