Supervised learning is given so much attention in the world of deep learning that we often forget about unsupervised methods.

What it does

I've implemented a version of Deep Embedded Clustering (DEC) in a Google Colab notebook that clusters the MNIST dataset with 89% unsupervised cluster accuracy, compared with the 52% achieved by reducing the data to n dimensions with PCA and then running KMeans. The current design is a variational autoencoder that compresses the input data to a 50-dimensional latent space before assigning cluster centres with KMeans and computing the KL divergence.
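To make the KL divergence step concrete, here is a minimal NumPy sketch of the clustering loss as it is usually described for DEC: a Student's t kernel turns distances to the cluster centres into soft assignments, a sharpened target distribution is derived from them, and the loss is the KL divergence between the two. This assumes the KL term above is the DEC clustering loss rather than the VAE's prior term. The function names, the random stand-in embeddings and the choice of `alpha=1.0` are all illustrative, not taken from the notebook.

```python
import numpy as np


def soft_assign(z, centres, alpha=1.0):
    """Soft cluster assignments q_ij from a Student's t kernel (assumed alpha=1)."""
    # Squared Euclidean distance between every embedding and every centre.
    d2 = ((z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalise so each row is a distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened targets p_ij: square q and divide by the soft cluster size."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples and clusters."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Stand-in data: 6 random 50-dimensional embeddings and 3 random centres.
# In the real pipeline these would come from the encoder and from KMeans.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 50))
centres = rng.normal(size=(3, 50))

q = soft_assign(z, centres)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

During training, `p` would normally be treated as a fixed target and recomputed only every so often, while gradients flow through `q` into the encoder and the centres.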

How I built it

With an awful lot of reading, as well as a fair bit of battling with TensorFlow.

Challenges I ran into

Fourteen hours in, I realised I'd built my loss metrics incorrectly.

Accomplishments that I'm proud of

I managed to produce a model that performs better than traditional clustering methods.
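For anyone wanting to reproduce the comparison, unsupervised cluster accuracy is commonly computed by finding the one-to-one mapping between cluster IDs and true labels that maximises the number of matches, then scoring that mapping. The sketch below follows that common definition using SciPy's `linear_sum_assignment`; I'm assuming this is the metric behind the accuracy figures, and the `cluster_accuracy` name and toy labels are only for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def cluster_accuracy(y_true, y_pred):
    """Best-case accuracy over one-to-one mappings from cluster IDs to labels."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    n = int(max(y_true.max(), y_pred.max())) + 1

    # counts[c, l] = number of samples placed in cluster c whose true label is l.
    counts = np.zeros((n, n), dtype=np.int64)
    for t, c in zip(y_true, y_pred):
        counts[c, t] += 1

    # Hungarian algorithm: choose the cluster-to-label mapping with most matches.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / y_true.size


# Toy example: cluster IDs are a permutation of the labels, with one mistake.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 0]
acc = cluster_accuracy(y_true, y_pred)
```

Because the mapping is chosen after the fact, the score does not depend on which arbitrary ID KMeans or DEC happens to give each cluster.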

What I learned

How to install LaTeX on Windows. How to create custom layers in TensorFlow.

What's next for Deep Clustering (with a Conv-Vari Autoencoder) on MNIST

Write a literature review
