Inspiration
I've worked with various forms of image classification before, and a major difficulty has always been tracking down a large enough dataset. For a particularly niche task, finding a dataset can be completely impossible. This website lets you train once on a widely available dataset and then reuse that model to perform diagnosis for a different task.
What it does
I used cell stain images as a case study, but it's easy to see how this could be applied elsewhere. The website lets the user submit any image of a cell stain, regardless of disease, and receive several forms of diagnostic information. This includes multiple measures of distance between the submitted image and the normal cell distribution, telling the user whether the submitted cell is likely normal. It also includes an anomaly heatmap that highlights the specific unusual regions in the submitted image, allowing for smarter analysis. Finally, it includes a similarity feature that shows images semantically similar to the submitted one, along with their diagnoses, serving as an automatic visual "case study" search.
How I built it
The core technology used is a variational autoencoder (VAE). I go into more depth in the video, but it essentially lets me model a distribution of images, parameterized by a latent variable that, ideally, captures the semantic features of the image. The VAE is trained to maximize the evidence lower bound (ELBO), which is composed of the negative reconstruction loss (MSE is used here) and the negative KL divergence between the variational distribution over the latent variable and the prior distribution over the latent variable (a standard normal is used here). Those three quantities (the ELBO, the reconstruction loss, and the KL divergence) are the distance measures used to judge whether a given image belongs to the original distribution. Following the idea of Baur et al., the difference between the original image and its reconstruction is used for automatic anomaly segmentation. Image similarity is measured through the cosine distance between latent variables. The model was trained using TensorFlow and Keras (source is on GitHub). The website itself is hosted with Flask, with Redis handling the queuing of requests.
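To make the three distance measures and the segmentation idea concrete, here is a minimal NumPy sketch. It assumes the encoder outputs a Gaussian mean and log-variance pair (z_mean, z_log_var) and the decoder outputs a reconstruction; all function names are illustrative, not the project's actual API.

```python
import numpy as np

def reconstruction_mse(image, reconstruction):
    # Pixel-wise mean squared error between the input and its VAE reconstruction.
    return float(np.mean((image - reconstruction) ** 2))

def kl_divergence(z_mean, z_log_var):
    # Closed-form KL divergence between the variational Gaussian
    # N(z_mean, exp(z_log_var)) and the standard normal prior N(0, I).
    return float(-0.5 * np.sum(1.0 + z_log_var - z_mean ** 2 - np.exp(z_log_var)))

def negative_elbo(image, reconstruction, z_mean, z_log_var):
    # ELBO ~ -(reconstruction loss) - KL, so the negative ELBO combines
    # both terms: a higher score means the image is less likely under
    # the learned distribution of normal cells.
    return reconstruction_mse(image, reconstruction) + kl_divergence(z_mean, z_log_var)

def anomaly_heatmap(image, reconstruction):
    # Baur et al.-style segmentation: the per-pixel residual highlights
    # regions the model failed to reconstruct, i.e. likely anomalies.
    return np.abs(image - reconstruction)

def cosine_similarity(z_a, z_b):
    # Latent-space similarity used to retrieve semantically similar
    # reference images for the "case study" search.
    return float(np.dot(z_a, z_b) / (np.linalg.norm(z_a) * np.linalg.norm(z_b)))
```

In practice the heatmap would be thresholded or overlaid on the input image, and the cosine similarity would be computed against a precomputed bank of latent codes for the reference dataset.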
Challenges I ran into
Because of resource constraints, I had to convert the Keras model to a TFLite model before I could deploy it. But the random-normal layer used for latent variable sampling isn't supported by TFLite in Python. I fixed this by replacing the sampling layer with the latent mean after training, though I'm worried that this may affect accuracy.
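The workaround amounts to making the encoder deterministic at export time. A small NumPy sketch of the two behaviors (function names are illustrative, assuming the usual reparameterization trick):

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng=None):
    # Training-time reparameterization: z = mu + sigma * eps, eps ~ N(0, I).
    # The random-normal op here is what TFLite could not convert.
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal(np.shape(z_mean))
    return z_mean + np.exp(0.5 * z_log_var) * eps

def deterministic_latent(z_mean, z_log_var):
    # Export-time replacement: take the mean of the Gaussian instead of
    # sampling (equivalent to fixing eps = 0), making the graph deterministic.
    return z_mean
```

In Keras terms this corresponds to swapping the sampling layer for an identity on z_mean before conversion. Since the mean is the most likely latent code, inference stays sensible, but it discards the variance information the model was trained with, which is the accuracy concern mentioned above.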
Accomplishments that I'm proud of
I've never implemented VAEs for an actual use case before, so this was a very valuable experience.
What I learned
I strengthened my skills in various areas of machine learning and web development.
What's next for Novis
I want to expand the model to other types of images. I also want to clean up parts of the design of the website.