When I heard the tragic news two years ago that a close family friend had passed away from cancer, I knew I needed to act. I spent hours analyzing complex research papers on the various types of cancer, their causes, their treatments, and scientists' numerous attempts to find a cure for a disease that has taken countless lives. I also wrote comprehensive outlines of what I learned, in the hope of someday coding the cure for cancer. Today, our world is plagued by a new deadly virus that has also taken many lives: COVID-19. The need for an equitable, accessible, and sustainable solution to aid those most susceptible to COVID-19 has never been more pressing. Cancer patients in particular are at high risk of serious illness from the infection, because cancer and its treatments weaken the immune system. Realizing this, I want to combat COVID-19 by developing a convenient tool that changes how quickly we can diagnose cancer before it's too late. I also now realize that life is short. Every day, I wish I could go back in time to see my friend again and talk with her about my passions for writing, piano, hiking, and much more. I can't change the past. However, I can change the way we diagnose cancer, making it accessible enough that researchers and even patients themselves can easily use it.
What it does
RapidCare uses features from the Breast Cancer Wisconsin (Diagnostic) dataset in the University of California, Irvine's Machine Learning Repository. The features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass and describe the characteristics of the cell nuclei present in the image. They include, but are not limited to, the average radius, texture, and symmetry. RapidCare feeds these features into a logistic regression machine-learning model and outputs a prediction of whether the breast mass is malignant (cancerous) or benign (non-cancerous).
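To make the inputs concrete, here is a minimal sketch that inspects the features and labels. It assumes scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) dataset (`load_breast_cancer`), which may differ from the file RapidCare actually reads.

```python
from sklearn.datasets import load_breast_cancer

# Assumption: use scikit-learn's bundled copy of the dataset,
# not necessarily the file used in the RapidCare project.
data = load_breast_cancer()

# One row per breast mass sample, one column per nucleus feature.
n_samples, n_features = data.data.shape
print(n_samples, n_features)  # 569 30

# A few of the feature names (mean radius, mean texture, ...).
first_features = [str(name) for name in data.feature_names[:3]]
print(first_features)

# In this copy, label 0 is malignant and label 1 is benign.
label_names = [str(name) for name in data.target_names]
print(label_names)
```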
How I built it
For RapidCare, I used Python for the machine-learning model. I used several libraries, including sklearn, NumPy, pandas, pickle, and seaborn, to help with data cleaning, visualization, and numerical computation. I built the model using fundamental machine-learning steps: splitting the data into two parts for training and testing, checking for overfitting (where performance on the training data underestimates the model's true error rate), and then fitting a logistic regression model to the data. I chose logistic over linear regression because the target in this dataset is categorical: each sample is diagnosed as malignant (M) or benign (B). Given new features, the model outputs a probability for each class and predicts a label of 0 or 1, where 0 represents a malignant tumor and 1 represents a benign one. Then, I converted the logistic regression model into a byte stream using "pickling" from Python's pickle module so that I could load the model in my Flask web application. Flask loaded the model when I started the app with "flask run" in the terminal. I designed the web application using HTML5 and CSS3.
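The steps above can be sketched roughly as follows. This is an illustration, not the project's actual code: it assumes scikit-learn's bundled copy of the dataset, and the `StandardScaler` step, the 80/20 split, `random_state=42`, and `max_iter=1000` are all assumptions chosen so the example runs cleanly.

```python
import pickle

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Labels in this copy: 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the samples for testing (split size is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling is an added assumption; it helps the solver converge.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# A large gap between these two scores would suggest overfitting.
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")

# Pickle the fitted model to a byte stream and restore it,
# as a web app would do when loading a saved model.
model_bytes = pickle.dumps(model)
restored = pickle.loads(model_bytes)

# Predict the label and class probabilities for one unseen sample.
sample = X_test[:1]
predicted_label = restored.predict(sample)
probabilities = restored.predict_proba(sample)  # columns: P(0), P(1)
print(predicted_label, probabilities)
```

In a real application the bytes would typically be written to a file with `pickle.dump` and read back with `pickle.load` when the Flask app starts. Only unpickle files you created yourself, since loading an untrusted pickle can execute arbitrary code.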
Challenges I ran into
A significant challenge I ran into was getting the CSS3 to load with my HTML5 template in Flask. The stylesheet would not load until I placed it in a folder named "static", which is where Flask looks for static files by default. I also struggled with cleaning the dataset. There are so many small, hardly noticeable mistakes a coder can make.
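A minimal sketch of the static-folder convention, assuming a stylesheet named `style.css` (a hypothetical file name) inside `static/`. The template is inlined with `render_template_string` only to keep the example self-contained; a real app would normally keep it in `templates/`.

```python
from flask import Flask, render_template_string, url_for

# By default, Flask serves files from a "static" folder next to the app
# under the "/static" URL path.
app = Flask(__name__)

# Hypothetical page; "style.css" is an assumed file name in static/.
PAGE = """<!doctype html>
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
<h1>RapidCare</h1>"""


@app.route("/")
def index():
    # url_for('static', ...) builds the correct link to the stylesheet.
    return render_template_string(PAGE)


# Check the generated stylesheet URL without starting a server.
with app.test_request_context():
    css_url = url_for("static", filename="style.css")
    print(css_url)  # /static/style.css
```

Linking the stylesheet through `url_for('static', ...)` rather than a hand-written relative path keeps the link valid no matter which route renders the page.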
Accomplishments that I'm proud of
I'm most proud of my consistency and diligence throughout the entire coding process. I was able to efficiently build a machine-learning model and deploy it using Flask. Prior to this hackathon, I had never used Flask. Through hard work and a lot of Googling, I managed to code a working Flask application that uses my machine-learning model seamlessly.
What I learned
Ultimately, I learned more about critical thinking. While coding this project, I realized that coding is more than just sporadically typing lines of code. I came to appreciate and apply the "D.R.Y." (Don't Repeat Yourself) software engineering principle and the "think before you code" methodology, and I built a more solid foundation in machine-learning basics. In a short amount of time, I learned how to deploy machine-learning models using Flask and how to analyze outliers in a dataset.
What's next for RapidCare
In the future, I plan to incorporate other types of cancer, not just breast cancer. Additionally, RapidCare will have an image classifier system where a user uploads a picture of a cell body to automatically determine whether it's a cancerous tumor or not. Although RapidCare is already efficient for this case, the image classifier will automate the task a step further. I also want users to be able to learn about cancer in a convenient, engaging way. Hence, I aspire to code an interactive cancer research journal within the app that highlights and explains confusing words and challenges how we perceive cancer as an incurable disease.