Heart Failure Classifier

Classifies cardiovascular risk from a range of health metrics

[Image gallery: Feature Selection; Overall Model Metrics; AdaBoost Hyperparameter Tuning; AdaBoost Model Performance; Risk Assessment; K-Means Model Performance; Error Visualization; Data Wrangling]

Inspiration

Cardiovascular disease is one of the leading causes of death globally. Being able to assess an individual's risk early can help prevent the disease from progressing to the point where it ends their life.

What it does

We developed a model that classifies whether an individual is likely to develop cardiovascular disease. It also outputs a probability that can be used to gauge severity and assess future risk.
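As a rough illustration of how one model can give both a class label and a probability, here is a minimal sketch using scikit-learn. The data is a synthetic stand-in generated with make_classification, and the Random Forest settings are assumptions for the example, not the project's actual dataset or configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the health-metric table (assumption: not the real data).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

labels = clf.predict(X_test)            # hard 0/1 classification
risk = clf.predict_proba(X_test)[:, 1]  # estimated probability of the positive class

print(labels[:5], risk[:5].round(2))
```

The probability column is what a risk assessment could be built on, for example by ranking individuals or by choosing a decision threshold other than 0.5.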

How we built it

We began with a k-Means model to check that the data is spatially separable. After verifying that our features were spatially relevant, we moved to more advanced tree-based models, in this case Random Forest, and then tried boosting with AdaBoost for greater performance.
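The workflow described above could look something like the following sketch. It runs on synthetic data, and the specific choices (two clusters, standardized features, the adjusted Rand index as the separability check, 5-fold accuracy, default AdaBoost base learner) are assumptions made for illustration rather than the settings we actually used.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the health-metric table (assumption: not the real data).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)

# Step 1: cluster without labels, then compare the clusters to the true classes.
X_scaled = StandardScaler().fit_transform(X)
cluster_ids = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
ari = adjusted_rand_score(y, cluster_ids)  # 1.0 = clusters match classes, near 0 = chance

# Step 2: compare the supervised ensembles with 5-fold cross-validated accuracy.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=100, random_state=0),
}
scores = {name: cross_val_score(model, X, y, cv=5).mean() for name, model in models.items()}

print(f"ARI={ari:.2f}", scores)
```

An adjusted Rand index well above zero suggests the unlabeled clusters line up with the classes, which is one way to argue that the features carry spatial signal before investing in heavier models.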

Challenges we ran into

Feature selection is difficult, and tuning the hyperparameters for each model takes a lot of time. Picking the features that, in our experience, are the most impactful let us achieve these results in a manageable time. Another challenge was getting the model to improve with AdaBoost: unfortunately, the results for Random Forest and AdaBoost were fairly similar. On the bright side, the results were still exciting.
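For readers unfamiliar with hyperparameter tuning, a small grid search over AdaBoost might be sketched as follows. The grid values, the 3-fold split, and the synthetic data are all assumptions chosen to keep the example quick; they are not the search space from this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the health-metric table (assumption: not the real data).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)

# Hypothetical search space: number of boosting rounds and shrinkage per round.
param_grid = {
    "n_estimators": [25, 50, 100],
    "learning_rate": [0.1, 0.5, 1.0],
}

search = GridSearchCV(
    AdaBoostClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold cross-validation for every combination
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The cost grows with the product of the grid sizes times the number of folds, which is why limiting the features and the grid matters when time is short.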

Accomplishments that we're proud of

We are most proud of how much we were able to accomplish in such a short timespan, and of how our model could potentially help the millions of people at risk of cardiovascular disease.

What we learned

We learned a lot about hyperparameter searches and about manipulating data so that plots are not only aesthetic but also informative. We also learned about data quality, and how some datasets need far more work than others to get into a form that we can analyze.

What's next for Heart Failure Classifier

The next step is hypothesis testing to remove less relevant features. Needing fewer inputs would make the model more accessible in developing countries, where certain tests may not be widely available. We would also like to engineer features that separate the data better spatially, and to test a wider range of support vector machines and decision trees to see which performs best.
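One possible shape for these next steps is sketched below: a per-feature ANOVA F-test to flag weak features, followed by a cross-validated comparison of a support vector machine and a decision tree on the reduced set. The test, the 0.05 cutoff, the model settings, and the synthetic data are assumptions for illustration; the project has not committed to any of them.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the health-metric table (assumption: not the real data).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)

# ANOVA F-test per feature: a small p-value means the class means differ.
f_stats, p_values = f_classif(X, y)
keep = p_values < 0.05  # hypothetical significance cutoff
X_reduced = X[:, keep]

# Compare candidate models on the reduced feature set (5-fold accuracy).
candidates = {
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}
results = {name: cross_val_score(model, X_reduced, y, cv=5).mean()
           for name, model in candidates.items()}

print(f"kept {keep.sum()} of {X.shape[1]} features", results)
```

A univariate test like this ignores interactions between features, so it would be a first filter rather than a final answer on which measurements can be dropped.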

Built With

kaggle
pandas
python
seaborn
sklearn

Updates

Dillon Marquard started this project — May 16, 2021 01:13 AM EDT
