SkinDeep

Inspiration

Skin cancer is the most common cancer in the United States: roughly one in five Americans develops some form of it during their lifetime, and about 5 million are treated for it each year. Melanoma in particular can be fatal if caught in its late stages, but it is extremely treatable if caught early. Even so, many people put off seeing an oncologist or dermatologist about a suspicious lesion because of cost, a lack of insurance, or personal and private reasons. SkinDeep is an application that gives individuals a private, cheap, and simple way to predict whether a skin lesion might be cancerous and, if so, to schedule an appointment with a physician to confirm.

What it does

SkinDeep is a web application that uses machine learning to let individuals classify skin lesions as potentially cancerous or not. After the user takes or uploads a picture, an algorithm we wrote ourselves extracts relevant features from the image, and the application returns a classification of the mole as potentially cancerous or not.

How we built it

We built the SkinDeep front end with React and Webpack. The image is sent to a server for feature extraction, where three image-processing algorithms we developed extract up to nine relevant features from the image, including eccentricity, symmetry, and color variation. The image is then sent to Microsoft Azure servers and analyzed by a Two-Class Support Vector Machine that was trained on a public set of relevant dermatological images. The algorithm returns its prediction of whether the mole is cancerous or not. The website is served with Node.js, while the image processing and machine learning are done in Python using the scikit-learn, scikit-image, and OpenCV libraries.
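To make the feature-extraction step concrete, here is a minimal sketch of how three of the features named above (eccentricity, symmetry, and color variation) could be computed. It is not our actual extraction code: the function name `lesion_features`, the specific formulas, and the assumption that a lesion mask has already been segmented are all illustrative, and it uses plain NumPy rather than scikit-image or OpenCV.

```python
import numpy as np


def lesion_features(rgb, mask):
    """Return (eccentricity, symmetry, color_variation) for one lesion.

    rgb  : (H, W, 3) float array with values in [0, 1]
    mask : (H, W) boolean array, True on lesion pixels (assumed to be given;
           segmentation is out of scope for this sketch)
    """
    ys, xs = np.nonzero(mask)

    # Eccentricity of the ellipse that has the same second moments as the
    # mask: 0 for a circle, approaching 1 for a very elongated shape.
    cov = np.cov(np.stack([xs, ys]).astype(float))
    minor, major = np.sort(np.linalg.eigvalsh(cov))
    if major > 0:
        eccentricity = float(np.sqrt(np.clip(1.0 - minor / major, 0.0, 1.0)))
    else:
        eccentricity = 0.0

    # Symmetry: intersection over union between the lesion's bounding-box
    # crop and its left-right mirror image. 1.0 means perfectly symmetric.
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    mirrored = crop[:, ::-1]
    symmetry = float((crop & mirrored).sum() / (crop | mirrored).sum())

    # Color variation: standard deviation of each RGB channel inside the
    # lesion, averaged over the three channels.
    color_variation = float(rgb[mask].std(axis=0).mean())

    return eccentricity, symmetry, color_variation


# Usage on a synthetic elliptical "lesion" with random colors.
yy, xx = np.mgrid[:64, :64]
demo_mask = ((xx - 32) / 20.0) ** 2 + ((yy - 32) / 8.0) ** 2 <= 1.0
demo_rgb = np.random.default_rng(0).random((64, 64, 3))
print(lesion_features(demo_rgb, demo_mask))
```

A feature vector like this one, extended to all nine features, is the kind of input a two-class classifier would be trained on.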

Challenges we ran into and what's next for SkinDeep

The greatest challenges we faced were directly related to feature extraction and image processing. Given our time limitations and the computational resources available to us, the team was confident that a two-class SVM was the best model, so the accuracy of the model depended mainly on extracting relevant features. We spent hours reading papers and talking to practicing physicians about what they look for when making their own assessments, to understand which visual cues might be indicative of cancerous growth.

Medical image classification is a rich field to explore. The team really enjoyed our introduction to it and has many possible directions for future work. We would like to classify more types of cancer to make the diagnosis more specific: melanoma is not a simple “yes or no” question, but has many different subtypes of varying medical concern. To do this, we would like to use a deep learning framework such as Theano or Caffe, or build our own Convolutional Neural Network for image analysis. These networks, pioneered by Geoffrey Hinton at the University of Toronto and Yann LeCun at NYU, have been shown to reach high accuracy in ImageNet competitions and could feasibly be applied to medical diagnosis. As is often said, reaching 60, 70, or even 90% accuracy with these models is technically challenging but not impossible; going beyond that is where it becomes significantly harder. We want to understand that last 10%.

Additionally, with regard to feature analysis, we would compile a large list of 100+ candidate features that might be predictive, and use Principal Component Analysis (PCA) to determine which features are most important for detection.
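As a rough illustration of that idea, the sketch below runs scikit-learn's PCA on a synthetic stand-in for a 100-feature table. The data, the 95% variance threshold, and the variable names are assumptions made for the example, not results from SkinDeep. Note that PCA ranks directions of variance without looking at the labels, so the loadings hint at which features carry the most signal but do not prove they predict cancer.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples, n_features = 200, 100

# Synthetic stand-in for a feature table: 100 correlated features that are
# really driven by 3 hidden factors plus a little noise.
latent = rng.normal(size=(n_samples, 3))
mixing = rng.normal(size=(3, n_features))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

# Standardize so that no feature dominates just because of its units.
X_scaled = StandardScaler().fit_transform(X)

# Keep as many components as are needed to explain 95% of the variance.
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X_scaled)

print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())

# Original features with the largest absolute loading on the first component.
top_features = np.argsort(np.abs(pca.components_[0]))[::-1][:5]
print("top features on component 1:", top_features)
```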

We would also like to see whether similar methods can be applied to images other than JPG, BMP, TIFF, or PNG photographs, e.g. MRI scans and computed tomography (CT) imaging. These image sources might actually be easier to analyze because they lack the “noise” inherent in digital cameras. However, they present their own challenges because the relevant features are different.

Finally, we want to add useful features to the application, such as a confidence level for each prediction and timely statistics on how a user's lesions change over time, which could also indicate progression toward cancerous moles. Integrating the application with other applications, such as those that allow teleconferencing with doctors, would also help bring automated visual diagnosis into the primary method of care.
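One possible way to attach a confidence level to an SVM prediction is to calibrate its scores into probabilities. The sketch below does this with scikit-learn on synthetic nine-feature vectors. It is an assumption-laden illustration rather than our Azure model: the data is made up, and the choice of sigmoid (Platt) calibration with 5-fold cross-validation is just one reasonable option.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted feature vectors (9 features per lesion).
benign = rng.normal(loc=0.0, scale=1.0, size=(60, 9))
suspicious = rng.normal(loc=2.0, scale=1.0, size=(60, 9))
X = np.vstack([benign, suspicious])
y = np.array([0] * 60 + [1] * 60)  # 0 = benign, 1 = potentially cancerous

# Two-class SVM whose raw scores are mapped to probabilities by sigmoid
# calibration, fitted with 5-fold cross-validation.
model = make_pipeline(
    StandardScaler(),
    CalibratedClassifierCV(SVC(kernel="rbf"), method="sigmoid", cv=5),
)
model.fit(X, y)

# Confidence that a new lesion is potentially cancerous.
new_lesion = np.full((1, 9), 2.0)
proba = model.predict_proba(new_lesion)[0]
print(f"potentially cancerous with confidence {proba[1]:.2f}")
```

The reported number is only as trustworthy as the calibration data, so it should be shown to users as a rough indication rather than a medical probability.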