Inspiration
As someone interested in medicine, I wanted to make an application that would make doctors' lives easier. I came across the cancer dataset on Kaggle and thought it would be really cool to make a machine learning application that would help doctors classify cancer cells.
What it does
It takes data that would be collected in a cytology lab and uses a logistic regression model to predict whether a cell or cluster of cells is benign (not super dangerous) or malignant (super dangerous). It also shows a radar chart of what the cancer cell "looks like" in terms of its characteristics.
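To make the prediction step concrete, here is a minimal sketch of how a logistic regression turns a set of measurements into a benign/malignant label. The feature names, weights, and intercept below are hypothetical values picked for illustration, not the ones from the trained model in this project.

```python
import math

# Hypothetical, hand-picked values for illustration only; a trained model
# would learn the weights and intercept from the dataset.
WEIGHTS = {"radius_mean": 1.2, "texture_mean": 0.4, "concavity_mean": 0.9}
INTERCEPT = -0.3


def malignant_probability(scaled_features):
    """Return the estimated P(malignant) for one sample of standardized measurements."""
    # Weighted sum of the features plus the intercept.
    z = INTERCEPT + sum(WEIGHTS[name] * scaled_features[name] for name in WEIGHTS)
    # The logistic (sigmoid) function squashes z into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def classify(scaled_features, threshold=0.5):
    """Label a sample as malignant when its probability reaches the threshold."""
    if malignant_probability(scaled_features) >= threshold:
        return "malignant"
    return "benign"


# A made-up sample with above-average measurements.
sample = {"radius_mean": 1.5, "texture_mean": 0.2, "concavity_mean": 1.1}
label = classify(sample)
```

The 0.5 threshold is the usual default; an app meant to support diagnosis might reasonably lower it so that fewer malignant samples are missed.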
How we built it
I used Python as the main language, with scikit-learn, pandas, and pickle to create the machine learning model on the backend. I then used Streamlit, NumPy, Plotly, and pandas to create the frontend and data visualization.
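A backend along these lines could look roughly like the sketch below. It is an assumption-heavy illustration, not the project's actual code: it uses scikit-learn's bundled breast cancer dataset as a stand-in for the Kaggle CSV, and the scaling step, train/test split, and variable names are all my own choices.

```python
import pickle

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): scikit-learn's bundled breast cancer dataset,
# loaded as a pandas DataFrame instead of reading the Kaggle CSV.
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target  # in this dataset 0 = malignant, 1 = benign

# Hold out 20% of the samples to check the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Standardize the features, then fit a logistic regression on them.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly.
accuracy = model.score(X_test, y_test)

# Serialize the fitted pipeline so a frontend can load it without retraining.
model_bytes = pickle.dumps(model)
restored = pickle.loads(model_bytes)
```

In a real app the pickled bytes would typically be written to a file with `pickle.dump` and loaded by the Streamlit script at startup. Bundling the scaler and the classifier in one pipeline means the frontend cannot forget to scale its inputs.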
Challenges we ran into
I am fairly new to machine learning, so finding the right model was a bit of a challenge. I first tried k-means clustering and an LSTM neural network, but after doing some research I found that logistic regression aligned best with my goals. I also ran into some dependency issues, but they were very minor and did not affect the overall development experience.
Accomplishments that we're proud of
Being able to make an app that would help healthcare workers is something I am very proud of. This was also my first time using Streamlit, so getting to learn a new technology was really cool.
What we learned
In terms of technical accomplishments, learning Streamlit and logistic regression was really cool. I also learned a lot about how doctors classify cancer cells, which was really interesting.
What's next for Malignancy Predictor For Breast Cancer
I feel like the model has a scale issue: the dataset from Kaggle was a bit small, and I think I could get a more accurate model with more data to train on. Adding more data visualizations of what the cancer cell could look like or represent would also be a cool thing to do in the future.
Built With
- csv
- kaggle
- machine-learning
- numpy
- pandas
- pickle
- plotly
- python
- scikit-learn
- streamlit