Exemplary Learning

The 4th Floor: Victor Mora, Thomas Summe, William Zhang

Introduction

Deep learning is often touted as mimicking the structure of the human brain. In reality, more complicated networks lose that resemblance, and the decision-making process becomes obfuscated behind many layers. We attempt to clarify this by creating a network that corresponds more directly to human thinking via prototypes. Prototype theory is a proposed account of how humans learn to categorize objects: people classify stimuli by comparing them to representations stored in memory, the prototypes. Our network therefore contains layers that emulate prototype behavior. One reason we believe this is important is that learned prototypes and exemplars should be extractable and viewable, allowing for a better understanding of the model. Additionally, there is ambiguity as to how people use prototypes and exemplars in tandem, and evaluating how our model learns might give insight into their human psychological correlates.

Methodology

We perform an initial feature extraction consisting of two convolution and max-pooling layers, which reduces each 28x28 MNIST image to a 7x7 abstract feature image. From there, instead of using dense layers and softmax, we store learned 7x7 prototype images for each class and calculate the distance between each prototype and the feature image. We then take the minimum distance among the prototypes of each class to get a vector of per-class distances, which we softmax to obtain a vector of probabilities used for classification. For training, we combine Distance-Based Cross Entropy (DCE) with a Prototype Loss (PL) that minimizes the distance between the feature image and the prototypes of the matching class. Additionally, we implemented and trained a decoder, built from upsampling and transposed convolution layers, to help visualize what the model learned in its prototypes.

Ethics

The model we are attempting to create is meant to more accurately represent the mechanisms hypothesized to underlie human learning and cognition, including the use of heuristics and exemplars to classify novel stimuli into discrete categories. If successful, our model might give insight into and advance the field of metacognition, as well as allow for better predictions and higher accuracy. In addition, examining the mistakes our model makes could reveal some of the inherent biases humans exhibit when performing classification, and thus facilitate societal change that mitigates these biases as they arise in decisions made by companies, governments, and individuals.

We plan to quantify success by the accuracy of our model in classifying the images passed to it. What makes our model unique is the paradigm by which it decides whether to grow existing categories or create new classifications for novel stimuli. The implication of this quantification is that our model will be incentivized to perform well while constrained by the hypothesized mechanism of human classification. As a result, our model might become an “optimized” form of human classification, or it may fail to satisfy both the human-like constraints and the demand for high accuracy.

Results

We achieved a maximum accuracy of 96.36% with this model, which met our expectations. We also managed to partially visualize what the model had learned in the abstract feature space through its prototypes. Finally, we noticed that adding more prototypes per class increased accuracy, but with diminishing returns.

Challenges

We encountered difficulty in visualizing the prototypes created by our model. We had hoped that these visualizations would help us corroborate the cognitive theories of human learning that we researched while building the model, but our findings on this front remain somewhat inconclusive. One issue is that the prototypes are representations of what the model learned in an abstract feature space, so a visualization will not necessarily look like a digit; it may contain only the most important points that represent one.

Reflections

We feel that our project turned out well. While it wasn’t quite able to meet all of our visualization goals, the goals we set for accuracy were well met: we reached both our implementation and accuracy goals (target and stretch, respectively). Our model worked as we expected, except in the degree to which we were able to visualize its prototypes. Over time, our approach became more concrete as we began implementing the model, but we did not make any significant pivots. If we had more time, we would like to extend our model to the CIFAR dataset, add probabilistic layers, and attempt again to use SSIM in lieu of MSE for image comparison to reach higher accuracy. Our biggest takeaways are that a model based on a human learning mechanism can produce viable results and meet accuracy benchmarks, and that we learned not only cognitive theories of human learning but also how they can apply to deep learning models.

Check-in 1:

https://docs.google.com/document/d/1zTdIFwQYE57G0wAklvf1iT8HIwGEq4kJIC2jWOIHZkA/edit?usp=sharing

Check-in 2:

https://docs.google.com/document/d/1gbBK9ra2Rmqgmoyut8p9RNlCs9PyB2nFMRYBzbz0H2E/edit?usp=sharing

Final Writeup:

https://docs.google.com/document/d/1ykB7PpFzPykuH4DRZFWJ8Vv2zAflEcWIJCmBPnG820s/edit?usp=sharing
