thelatent.space - Deep Clustering (with a CV-AE) on MNIST

Generated MNIST digits using random weights
3D scatter plot of clusters in 3 of the 50 latent dimensions

Inspiration

Supervised learning is given so much attention in the world of deep learning that we often forget about unsupervised methods.

What it does

I've implemented a version of Deep Embedded Clustering (DEC) in a Google Colab notebook that clusters the MNIST dataset with 89% unsupervised cluster accuracy, compared with the 52% achieved by a baseline of PCA (reduced to n dimensions) followed by KMeans. The current design is a variational autoencoder that compresses the input data to a 50-dimensional latent space, then assigns cluster centres with KMeans and computes a KL divergence.
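For readers unfamiliar with DEC, here is a minimal sketch of the clustering step as it appears in the published DEC formulation: a Student's t soft assignment between latent points and cluster centres, a sharpened target distribution, and the KL divergence between the two. This is an illustration only, not the notebook's code. The function names (soft_assign, target_distribution, kl_divergence) and the toy 2-D points are assumptions made for the example, and it uses plain Python lists where the notebook would use TensorFlow tensors.

```python
import math


def soft_assign(z, centres, alpha=1.0):
    """Student's t soft assignment q[i][j] of latent point i to centre j."""
    q = []
    for point in z:
        row = []
        for centre in centres:
            dist_sq = sum((a - b) ** 2 for a, b in zip(point, centre))
            row.append((1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # each row sums to 1
    return q


def target_distribution(q):
    """Sharpened targets p[i][j], proportional to q[i][j]^2 / f[j]."""
    n_clusters = len(q[0])
    # f[j] is the soft cluster frequency: the column sum of q.
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        p_ij * math.log((p_ij + eps) / (q_ij + eps))
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
    )


# Toy data (hypothetical): three 2-D "latent" points and two centres.
# In the project the points would be 50-dimensional encoder outputs and
# the centres would come from KMeans.
z = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centres = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(z, centres)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Minimising this loss pulls each point towards the centre it is already most confident about, which is why the KMeans initialisation matters.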

How I built it

With an awful lot of reading, as well as a fair bit of battling with TensorFlow.

Challenges I ran into

Fourteen hours in, I realised I'd built my loss metrics incorrectly.

Accomplishments that I'm proud of

I managed to produce a model that performs better than traditional clustering methods.
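"Performs better" here is measured by the unsupervised cluster accuracy quoted earlier. As a rough sketch of how that metric is usually defined: cluster ids are arbitrary, so the score is the best accuracy over all one-to-one mappings from clusters to true labels. The function name cluster_accuracy and the toy labels are assumptions for illustration, and the brute-force search over permutations is a stand-in that only suits a handful of clusters. For the 10 MNIST classes, the Hungarian algorithm (for example scipy.optimize.linear_sum_assignment) is the usual choice.

```python
from collections import Counter
from itertools import permutations


def cluster_accuracy(y_true, y_pred):
    """Best accuracy over one-to-one mappings of cluster ids to labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    labels = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    if len(clusters) > len(labels):
        raise ValueError("this sketch assumes no more clusters than classes")

    # counts[(cluster, label)] = number of samples with that pairing.
    counts = Counter(zip(y_pred, y_true))

    best = 0
    # Try every injective assignment of clusters to labels (brute force).
    for assigned in permutations(labels, len(clusters)):
        matched = sum(counts[(c, lab)] for c, lab in zip(clusters, assigned))
        best = max(best, matched)
    return best / len(y_true)


# Toy example (hypothetical labels): the cluster ids are a relabelling of
# the true classes, with one sample placed in the wrong cluster.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [2, 2, 0, 0, 1, 0]
acc = cluster_accuracy(y_true, y_pred)
```

Because the mapping is chosen after the fact, a pure relabelling of the true classes scores 1.0 even though no predicted id matches its label directly.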

What I learned

How to install LaTeX on Windows, and how to create custom layers in TensorFlow.

What's next for Deep Clustering (with a Conv-Vari Autoencoder) on MNIST

Write a literature review

Built With

google-cloud
google-colab
python
tensorflow

Updates

Will Humphreys started this project — Nov 17, 2019 05:33 AM EST
