Inspiration

The HackUSF AI in Healthcare problem statement instantly caught our attention. As a team, we both share a deep empathy for cancer patients and wanted to use our technical skills to make a real-world impact. Skin cancer is one of the most common—and most treatable—types of cancer if caught early. We aimed to build a practical tool that could assist in the early detection and classification of skin cancer using both medical images and clinical data.

What it does

This project is a two-phase skin cancer recognition tool that allows a user (e.g., a physician or researcher) to:

- Phase 1: Upload a skin lesion image. The AI model classifies it as benign or malignant with 85% accuracy, using a MobileNetV2-based deep learning model trained on Kaggle’s skin cancer dataset.
- Phase 2: Enter clinical information (if the lesion is malignant). The system prompts the user to enter patient data (age, gender, race, tumor grade, prior malignancy, etc.) and uses a Random Forest classifier trained on the MINDS clinical dataset to predict which type of skin cancer the patient is most likely to have.

The final result helps narrow down possible diagnoses and could assist physicians in planning further analysis or treatment.
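The two-phase flow above can be sketched as a simple gate: the subtype classifier only runs when the image model flags the lesion as malignant. The function names, threshold, and placeholder return values below (`classify_lesion`, `predict_subtype`, `MALIGNANT_THRESHOLD`) are illustrative stand-ins, not the project's actual API:

```python
# Sketch of the two-phase decision flow. Both model calls are hypothetical
# stand-ins for the real MobileNetV2 and Random Forest models.

MALIGNANT_THRESHOLD = 0.5  # assumed cutoff on the Phase 1 probability


def classify_lesion(image_path):
    """Stand-in for the MobileNetV2 model: returns P(malignant)."""
    return 0.9  # placeholder score for illustration


def predict_subtype(clinical_record):
    """Stand-in for the Random Forest subtype classifier."""
    return "melanoma"  # placeholder label for illustration


def diagnose(image_path, clinical_record):
    p_malignant = classify_lesion(image_path)
    if p_malignant < MALIGNANT_THRESHOLD:
        # Benign lesions stop here; no clinical data is needed.
        return {"verdict": "benign", "p_malignant": p_malignant}
    # Phase 2 only runs for lesions flagged as malignant.
    return {
        "verdict": "malignant",
        "p_malignant": p_malignant,
        "predicted_subtype": predict_subtype(clinical_record),
    }


result = diagnose("lesion.jpg", {"age": 62, "gender": "F", "grade": "G2"})
print(result["verdict"])
```

With the placeholder score of 0.9, this prints `malignant` and includes the Phase 2 subtype in the result.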

How we built it

- Phase 1: Trained a deep learning classifier using TensorFlow and MobileNetV2 on Kaggle’s skin cancer image dataset. It outputs a binary classification: benign vs. malignant.
- Phase 2: Queried and filtered the NIH MINDS clinical database using the med_minds Python package, then extracted and preprocessed patient data for 5 major skin cancer types. Built and trained a RandomForestClassifier using scikit-learn to predict cancer type from patient information.
- Developed a command-line interactive app that takes image and clinical inputs and performs both phases sequentially.
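The Phase 2 training step can be sketched in a few lines of scikit-learn. The feature columns and the five subtype labels here are synthetic stand-ins for the preprocessed MINDS records, not the real data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in for the preprocessed MINDS features:
# columns = [age, gender (0/1), tumor grade (0-3), prior malignancy (0/1)]
n = 500
X = np.column_stack([
    rng.integers(20, 90, n),  # age
    rng.integers(0, 2, n),    # gender, label-encoded
    rng.integers(0, 4, n),    # tumor grade
    rng.integers(0, 2, n),    # prior malignancy
])
# Five illustrative subtype labels (the real targets come from MINDS).
subtypes = ["melanoma", "basal cell", "squamous cell", "merkel cell", "other"]
y = rng.choice(subtypes, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict(X_test[:1])
print(pred[0])  # one of the five subtype labels
```

On real records the categorical columns (race, grade, etc.) would first go through label or one-hot encoding, as our preprocessing step did.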

Challenges we ran into

- Understanding and querying the MINDS PostgreSQL database via Docker was a huge learning curve.
- Dealing with real-world medical data: it’s complex, messy, and required heavy preprocessing.
- Managing GitHub’s file size limits (the MINDS datasets exceed 100 MB).
- Figuring out how to bridge the gap between two completely different datasets (image-based data and structured clinical information) was tricky but rewarding.

Accomplishments that we're proud of

- Successfully built an end-to-end dual-model AI application that integrates two independent data modalities.
- Learned how to work with complex medical data and train custom models under pressure in our first hackathon.
- Overcame technical blockers related to Docker, SQL, dataset merging, and model compatibility.
- Built something real and meaningful that could actually help physicians or researchers.

What we learned

- How to design and train deep learning and machine learning models for medical applications
- How to work with real clinical data and query it using SQL
- How to create an end-to-end AI pipeline that transitions from image classification to clinical reasoning
- Most importantly, how to break big problems into solvable pieces and persist through the roadblocks

What's next for Skin_Cancer_Recognition_Tool

- Further research into correlating image features with clinical features to create a multi-modal classifier.
- Expanding the model to support more cancer subtypes by integrating more records from MINDS.
- Improving the UI/UX: building a web or desktop interface so physicians can interact with the tool more intuitively.
- Deploying the app in a sandbox environment where users can upload images, enter clinical details, and receive AI-driven diagnostic support in real time.

Built With

python, tensorflow, scikit-learn, docker, postgresql, med_minds
