We aim to classify images of skin lesions to determine whether the lesion in each image is benign or malignant. We found a good dataset for this task, which is the primary reason we are pursuing it. This is a classification task.

We are choosing option one: implementing a project that already exists and applying its structure to a different dataset. Our implementation will use an architecture similar in structure to the one in the research paper. The paper notes that skin cancer detection takes place at rather early stages and that a deep learning model can help in recognizing it. Its model uses a set of layers to learn whether the picture being presented shows skin cancer or not. Our project is closely related: we want to demonstrate a model that can tell whether a lesion is benign or malignant. Our model will be a CNN that is similar in architecture to the one in the research paper.

Data: We are using a melanoma dataset from Kaggle, sourced from the International Skin Imaging Collaboration. The data has already been preprocessed and the labels are provided in CSV format, so there is not much preprocessing to do on our end. The dataset is already split into train and test sets, with 37,628 training images and about 11,000 test images.
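Since the labels arrive as a CSV file, a first step could be to load them and check how balanced the classes are. The sketch below is only illustrative: the column names `image_name` and `target`, the four sample rows, and the helper names `load_labels` and `class_counts` are all assumptions, not taken from the actual dataset.

```python
import csv
import io
from collections import Counter

# Stand-in for the real labels file. The column names "image_name" and
# "target" (0 = benign, 1 = malignant) are hypothetical.
SAMPLE_CSV = """image_name,target
img_0001,0
img_0002,0
img_0003,1
img_0004,0
"""


def load_labels(csv_file):
    """Map each image name to its integer label."""
    reader = csv.DictReader(csv_file)
    return {row["image_name"]: int(row["target"]) for row in reader}


def class_counts(labels):
    """Count how many images fall into each class."""
    return Counter(labels.values())


labels = load_labels(io.StringIO(SAMPLE_CSV))
counts = class_counts(labels)
for label, count in sorted(counts.items()):
    print(f"class {label}: {count} images")
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle. A strongly skewed class count would be worth knowing about before training.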

Methodology: What is the architecture of your model?

The architecture of our model is:

1. Batch_normalization
2. Conv2D1
3. Conv2D2
4. MaxPooling2D1
5. Dropout1
6. Conv2D3
7. Conv2D4
8. Dropout2
9. Flatten
10. Dense1
11. Dense2
12. Softmax

We will be using this architecture to train our model. We think the hardest part of implementing the paper will be matching the Area Under the Curve (AUC) that the paper reported, because we will be using different data. The AUC reported in the paper was 99.77%. We hope to see comparable results in our implementation.
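To get a feel for how this layer stack transforms an image, the following sketch traces output shapes through the twelve layers in pure Python. Only the layer names and their order come from the proposal. The input size, filter counts, kernel size, pooling size, dense width, and the use of unpadded stride-1 convolutions are all assumptions chosen for illustration.

```python
# Hypothetical hyperparameters: the proposal lists layer names only.
INPUT_SHAPE = (224, 224, 3)  # assumed (height, width, channels)
NUM_CLASSES = 2              # benign vs. malignant

# (layer name, layer kind, assumed parameters)
LAYERS = [
    ("Batch_normalization", "norm", {}),
    ("Conv2D1", "conv", {"filters": 32, "kernel": 3}),
    ("Conv2D2", "conv", {"filters": 32, "kernel": 3}),
    ("MaxPooling2D1", "pool", {"size": 2}),
    ("Dropout1", "dropout", {}),
    ("Conv2D3", "conv", {"filters": 64, "kernel": 3}),
    ("Conv2D4", "conv", {"filters": 64, "kernel": 3}),
    ("Dropout2", "dropout", {}),
    ("Flatten", "flatten", {}),
    ("Dense1", "dense", {"units": 128}),
    ("Dense2", "dense", {"units": NUM_CLASSES}),
    ("Softmax", "softmax", {}),
]


def output_shape(shape, kind, params):
    """Return the output shape of one layer given its input shape."""
    if kind == "conv":
        # Unpadded convolution with stride 1 shrinks each side by kernel - 1.
        height, width, _ = shape
        k = params["kernel"]
        return (height - k + 1, width - k + 1, params["filters"])
    if kind == "pool":
        # Non-overlapping pooling divides each spatial side by the pool size.
        height, width, channels = shape
        s = params["size"]
        return (height // s, width // s, channels)
    if kind == "flatten":
        total = 1
        for dim in shape:
            total *= dim
        return (total,)
    if kind == "dense":
        return (params["units"],)
    # Normalization, dropout, and softmax leave the shape unchanged.
    return shape


def trace(input_shape, layers):
    """Apply every layer in order and record the shape after each one."""
    shapes = []
    shape = input_shape
    for name, kind, params in layers:
        shape = output_shape(shape, kind, params)
        shapes.append((name, shape))
    return shapes


shapes = trace(INPUT_SHAPE, LAYERS)
for name, shape in shapes:
    print(f"{name:<20} {shape}")
```

Under these assumptions the Flatten layer produces a very wide vector, which suggests that the input resolution and the amount of pooling would need tuning in a real implementation.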

Metrics: We will be using AUC as a metric of our success. Our base goal is 60%, target is 70% and reach is 90%.
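As a rough illustration of the metric, AUC can be read as the probability that a randomly chosen malignant image receives a higher score than a randomly chosen benign one. The sketch below computes it that way and compares the result with the base, target, and reach goals above. The function names `roc_auc` and `goal_reached` and the toy labels and scores are made up for this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def goal_reached(auc):
    """Name the highest project goal that the given AUC meets."""
    if auc >= 0.90:
        return "reach"
    if auc >= 0.70:
        return "target"
    if auc >= 0.60:
        return "base"
    return "below base"


# Toy data: 1 = malignant, 0 = benign; scores are predicted probabilities.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}, goal level: {goal_reached(auc)}")
```

The pairwise loop is quadratic in the number of examples, so on the full test set a library routine such as scikit-learn's `roc_auc_score` would likely be the more practical choice.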

Ethics: Why is Deep Learning a good approach to this problem?

For medical diagnoses, we care about the result more than the approach a model took to get there. A deep learning model is therefore well suited: as long as the output is accurate, we do not need to know the approach the model took.

Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

The major stakeholders are patients and their families, hospitals, and insurance companies. If our algorithm were to make a mistake, someone would receive an inaccurate medical diagnosis and treatment. Insurance companies would be unhappy with our algorithm, and patients and their families would waste time and money.

One concern we have about our dataset is that we do not know how representative it is of all populations. We know that it was collected in one hospital, and we do not know anything about the diversity of the patients.

Division of labor: Briefly outline who will be responsible for which part(s) of the project.

Anish: responsible for implementing the CNN structure with Aayush
Preeti: responsible for preprocessing and helping implement the rest of the model structure
Aayush: responsible for implementing the CNN structure with Anish
