Inspiration

Breast cancer is one of the most common cancers among women worldwide. Early detection is critical for improving survival rates. I was inspired to build this project to explore how machine learning can assist in early detection of breast tumors and to demonstrate an end-to-end ML project in healthcare.

What it does

The Breast Cancer Prediction Model predicts whether a tumor is Malignant (cancerous) or Benign (non-cancerous) based on 8 key medical features. Users input the features into the app, and the model outputs a probabilistic prediction. While it cannot replace a professional diagnosis, it serves as an educational tool and proof-of-concept for AI in healthcare.

How I built it

I built the Breast Cancer Prediction Model using the Breast Cancer Wisconsin (Diagnostic) Dataset from the UCI Machine Learning Repository. The dataset contains 30 medical features per tumor, from which I selected 8 key features (such as mean radius, mean perimeter, and mean area) for simplicity and usability. I preprocessed the data by handling missing values, removing duplicates, and scaling the features. I trained multiple machine learning models, including Logistic Regression, a Random Forest Classifier, and an optional Neural Network built with TensorFlow/Keras. I chose the Random Forest Classifier for deployment because of its robustness, its ability to handle non-linear relationships, and its high accuracy with minimal overfitting. The trained model was serialized with Python's pickle module and deployed through a Streamlit web app, where users can enter tumor features and instantly receive predictions.
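The pipeline described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it uses the copy of the dataset bundled with scikit-learn, only the first three names in `FEATURES` come from the write-up (the other five are illustrative picks), and the hyperparameters and split are arbitrary. Scaling is omitted because tree ensembles are insensitive to feature scale.

```python
import pickle

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Assumption: only the first three names are mentioned in the write-up;
# the remaining five are illustrative choices, not the project's real subset.
FEATURES = [
    "mean radius", "mean perimeter", "mean area",
    "mean texture", "mean smoothness", "mean compactness",
    "mean concavity", "mean concave points",
]

data = load_breast_cancer()
all_names = list(data.feature_names)
columns = [all_names.index(name) for name in FEATURES]

X = data.data[:, columns]  # keep 8 of the 30 features
y = data.target            # scikit-learn's encoding: 0 = malignant, 1 = benign

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters here are assumptions chosen for illustration.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Serialize the trained model and restore it, as a deployed app would.
payload = pickle.dumps(model)
restored = pickle.loads(payload)

# Probabilistic prediction for one tumor: [P(malignant), P(benign)].
sample = X_test[:1]
proba = restored.predict_proba(sample)[0]
print(f"held-out accuracy: {test_accuracy:.3f}")
print(f"P(malignant) = {proba[0]:.3f}, P(benign) = {proba[1]:.3f}")
```

In a web app, the restored model would typically be loaded once at startup and `predict_proba` called on each set of user inputs.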

Challenges I ran into

One of the main challenges was reducing the original 30 features to only 8 while still maintaining high prediction accuracy. Another difficulty was minimizing false negatives, that is, malignant tumors predicted as benign, which is critical in medical prediction because a missed cancer is the most harmful kind of error. Additionally, designing a simple and intuitive user interface for non-technical users required careful thought to balance usability and functionality. Ensuring that the web app displayed results clearly and accurately while remaining visually clean was another hurdle I overcame.
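One common way to trade false negatives against false alarms is to lower the probability threshold at which a tumor is flagged as malignant. The sketch below illustrates that idea only; the write-up does not say which technique the project used. It trains on all 30 features of scikit-learn's copy of the dataset, and the helper name `count_false_negatives` and the thresholds 0.5 and 0.3 are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()  # target: 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, stratify=data.target, random_state=0
)
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Column of predict_proba that holds P(malignant), looked up via classes_.
malignant_col = list(model.classes_).index(0)
p_malignant = model.predict_proba(X_test)[:, malignant_col]


def count_false_negatives(p_malignant, y_true, threshold):
    """Count malignant tumors that are NOT flagged at the given threshold."""
    flagged = p_malignant >= threshold
    is_malignant = y_true == 0
    return int(np.sum(is_malignant & ~flagged))


def count_false_positives(p_malignant, y_true, threshold):
    """Count benign tumors that ARE flagged at the given threshold."""
    flagged = p_malignant >= threshold
    is_benign = y_true == 1
    return int(np.sum(is_benign & flagged))


for threshold in (0.5, 0.3):  # hypothetical thresholds for comparison
    fn = count_false_negatives(p_malignant, y_test, threshold)
    fp = count_false_positives(p_malignant, y_test, threshold)
    print(f"threshold={threshold}: false negatives={fn}, false positives={fp}")
```

Lowering the threshold can never increase the number of false negatives, but it can raise the number of false positives, so the right value depends on how the two kinds of error are weighed.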

Accomplishments that I'm proud of

I am proud to have built an end-to-end machine learning project that achieved ~97–99% accuracy. The project demonstrates my ability to preprocess data, select features, train and evaluate models, and deploy a fully functional web application. I also successfully created a user-friendly interface that makes it easy for anyone to use the tool. Overall, the project is portfolio-ready and serves as a strong demonstration of practical AI/ML skills in healthcare applications.

What I learned

Through this project, I gained hands-on experience in data preprocessing, feature engineering, and model training. I learned how to implement and compare multiple ML models, interpret feature importance, and deploy models as interactive web apps using Streamlit. I also realized the importance of balancing accuracy, usability, and interpretability in healthcare machine learning applications. Finally, the project improved my problem-solving skills and understanding of end-to-end AI pipelines.
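As an illustration of what interpreting feature importance can look like with a Random Forest, here is a minimal sketch that ranks all 30 features by scikit-learn's impurity-based `feature_importances_`. It is an assumption that the project used this particular measure; the hyperparameters are arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(data.data, data.target)

# Impurity-based importances: one non-negative score per feature, summing to 1.
ranked = sorted(
    zip(data.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

# Show the eight highest-ranked features as candidates for a reduced model.
for name, score in ranked[:8]:
    print(f"{name:<25} {score:.3f}")
```

A ranking like this is one possible input when deciding which subset of features to keep, though impurity-based scores can be spread across strongly correlated features, so the result is best treated as a guide rather than a final answer.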

What's next for Breast Cancer Prediction Model

In the future, I plan to experiment with gradient boosting models such as XGBoost to potentially improve accuracy. I also aim to enhance the app with real-time visualizations and interactive dashboards. Expanding the model to include additional features and possibly deploying it on mobile platforms will make it more accessible. Overall, my goal is to continuously refine the model while keeping it simple and practical for real-world healthcare applications.
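A first comparison between the current model family and gradient boosting could be set up along these lines. This is a sketch under assumptions: scikit-learn's `GradientBoostingClassifier` stands in for XGBoost, all 30 features are used, and default 5-fold cross-validated accuracy is the only metric.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

data = load_breast_cancer()

# Candidate models; hyperparameters are illustrative defaults, not tuned values.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# 5-fold cross-validated accuracy for each candidate.
mean_scores = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, data.data, data.target, cv=5)
    mean_scores[name] = float(scores.mean())
    print(f"{name}: mean accuracy = {mean_scores[name]:.3f}")
```

Because false negatives matter most in this setting, a fuller comparison would also look at recall on the malignant class rather than accuracy alone.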
