Inspiration
Growing up, one of our team members spent seven years living in India, where access to dermatological care was essentially nonexistent for most people. It wasn't that people didn't care; specialist visits were simply too expensive, so skin conditions that would have been identified and treated early in the United States were left to worsen. That experience never left him. When we learned that 1 in 5 Americans develop skin cancer in their lifetime and that a basic dermatologist visit costs $300 without insurance, the parallel was impossible to ignore. The same access problem exists right here. Melanoma caught early has a 99% survival rate; caught late, that number collapses to 27%. People are dying from something preventable, and the barrier is almost entirely economic. We built Deep Skin because we refused to accept that early detection should be a privilege.
What it does
Deep Skin lets anyone upload a photo of a suspicious skin lesion and receive an instant AI-powered risk assessment that classifies it as LOW, MEDIUM, or HIGH risk, along with a plain-English explanation of what was found and what to do next. The app displays a Grad-CAM heatmap showing exactly which regions of the lesion influenced the prediction, a breakdown of the ABCDE diagnostic criteria, and per-class probabilities across seven lesion types. Every scan is logged in a clinical dashboard and can be exported as a PDF report the patient can bring directly to a doctor.
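For illustration, here is a minimal sketch of how a seven-class prediction can be collapsed into the LOW/MEDIUM/HIGH tiers. The class codes follow the ISIC/HAM10000 taxonomy; the malignant grouping and thresholds below are placeholder assumptions, not the exact values the app uses.

```python
# Illustrative sketch only: grouping and thresholds are our assumptions.

# Seven lesion types from the ISIC (HAM10000) taxonomy.
CLASSES = ["akiec", "bcc", "bkl", "df", "mel", "nv", "vasc"]
MALIGNANT = {"mel", "bcc", "akiec"}  # classes treated as high-concern

def risk_tier(probs: dict[str, float]) -> str:
    """Collapse a seven-class probability distribution into LOW/MEDIUM/HIGH."""
    p_malignant = sum(probs[c] for c in MALIGNANT)
    if p_malignant >= 0.50:
        return "HIGH"
    if p_malignant >= 0.20:
        return "MEDIUM"
    return "LOW"

print(risk_tier({"akiec": 0.02, "bcc": 0.05, "bkl": 0.10, "df": 0.03,
                 "mel": 0.55, "nv": 0.20, "vasc": 0.05}))  # -> HIGH
```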
How we built it
We started by building a Flask backend that loads five pretrained CNN models (ResNet50, EfficientNetB5, VGG16, MobileNetV2, and a custom CNN), each fine-tuned on the ISIC skin lesion dataset. We implemented Grad-CAM using PyTorch gradient hooks on the final convolutional layer of each model, which required disabling all inplace ReLU operations to prevent gradient graph conflicts. Qwen2.5-VL runs locally on an AMD GPU via ROCm and generates the plain-English clinical explanation for every prediction. We then built a full clinical dashboard in HTML, CSS, and JavaScript with scan history, a model consensus panel, a Grad-CAM heatmap display with hotspot annotations, and a PDF report generator using jsPDF.
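To make the attribution step concrete, here is a minimal Grad-CAM sketch along the lines of what the backend does: a forward hook captures the target layer's activations, a backward hook captures its gradients, and the pooled gradients weight the activation maps. The function names and shapes here are illustrative, not our exact code; for torchvision's ResNet50, for example, the usual target layer is `model.layer4[-1]`.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, target_layer, image, class_idx):
    """Minimal Grad-CAM: hook the target conv layer, backprop the class
    score, and weight the activations by the pooled gradients."""
    activations, gradients = {}, {}

    def fwd_hook(module, inp, out):
        activations["value"] = out

    def bwd_hook(module, grad_in, grad_out):
        gradients["value"] = grad_out[0]

    h1 = target_layer.register_forward_hook(fwd_hook)
    h2 = target_layer.register_full_backward_hook(bwd_hook)
    try:
        model.zero_grad()
        score = model(image)[0, class_idx]   # image: (1, 3, H, W)
        score.backward()
    finally:
        h1.remove()
        h2.remove()

    acts = activations["value"]                       # (1, C, h, w)
    grads = gradients["value"]                        # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)    # pool grads per channel
    cam = F.relu((weights * acts).sum(dim=1))         # (1, h, w)
    cam = cam / (cam.max() + 1e-8)                    # normalize to [0, 1]
    return cam.squeeze(0).detach()
```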
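Loading Qwen2.5-VL through Hugging Face looks roughly like the sketch below; the model ID and prompt are placeholders, and it assumes a transformers release that includes the Qwen2.5-VL classes. One useful detail on ROCm: AMD GPUs are exposed through PyTorch's usual "cuda" device, so `device_map="auto"` picks them up without code changes.

```python
import torch
from PIL import Image
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor

MODEL_ID = "Qwen/Qwen2.5-VL-7B-Instruct"  # assumed checkpoint

# On a ROCm build of PyTorch, the AMD GPU is addressed as "cuda".
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

def explain(image: Image.Image, prediction: str) -> str:
    """Ask the VLM for a plain-English explanation of a classifier result."""
    messages = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": f"The classifier flagged this lesion as "
                                     f"{prediction}. Explain the finding in "
                                     f"plain English for a patient."},
        ],
    }]
    prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=[prompt], images=[image],
                       return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return processor.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                                  skip_special_tokens=True)[0]
```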
Challenges we ran into
Getting Grad-CAM working across all five architectures was the hardest problem we faced; VGG16's inplace ReLU operations kept breaking the gradient computation graph entirely. Running Qwen2.5-VL on an AMD GPU through ROCm introduced Hugging Face compatibility issues that took significant debugging to resolve. Under time pressure, building a UI that felt like real clinical software rather than a hackathon prototype was harder than we expected.
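The VGG16 fix turned out to be short once diagnosed. A sketch of the workaround, assuming torchvision's stock VGG16:

```python
import torch.nn as nn
from torchvision.models import vgg16

def disable_inplace_relu(model: nn.Module) -> nn.Module:
    # Inplace ReLU overwrites the activations that Grad-CAM's backward
    # hooks depend on, corrupting the saved tensors in the autograd graph.
    # Flipping every ReLU to out-of-place fixes the hooks without retraining.
    for module in model.modules():
        if isinstance(module, nn.ReLU):
            module.inplace = False
    return model

model = disable_inplace_relu(vgg16(weights="DEFAULT"))
```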
Accomplishments that we're proud of
We are most proud of getting the full pipeline working end-to-end (five-model inference, Grad-CAM, Qwen2.5-VL explanation, and PDF export), all running locally with no cloud dependency. The model consensus panel, which shows how each model voted individually and explains conflicts in plain English, is something we have not seen in any other student project.
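As a rough illustration of the consensus logic (the vote labels, wording, and helper name are placeholders, not our exact implementation):

```python
from collections import Counter

def consensus(votes: dict[str, str]) -> dict:
    """Summarize per-model votes into a verdict plus a plain-English
    note about conflicts. Illustrative logic only."""
    tally = Counter(votes.values())
    top, count = tally.most_common(1)[0]
    dissenters = [m for m, v in votes.items() if v != top]
    note = (f"All {len(votes)} models agree." if not dissenters else
            f"{count}/{len(votes)} models voted {top}; "
            f"{', '.join(dissenters)} disagreed, so review the heatmaps.")
    return {"verdict": top, "agreement": count / len(votes), "note": note}

print(consensus({"ResNet50": "mel", "EfficientNetB5": "mel", "VGG16": "nv",
                 "MobileNetV2": "mel", "CustomCNN": "mel"}))
```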
What we learned
We learned that the hardest part of medical AI is not the model—it is everything around it. The interface, the explanation, and the transparency of the output determine whether a patient trusts the result or ignores it. We also learned how to work with multimodal vision-language models on AMD GPU hardware and implement Grad-CAM across heterogeneous architectures.
What's next for DeepSkin
Longer term, we want to partner with free clinics and community health organizations to deploy Deep Skin as a first-line screening tool for uninsured patients.