What it does

This project uses four machine learning algorithms to predict hospital readmissions, with the goal of reducing the additional costs that readmissions impose on hospitals.

Algorithms Used

Multinomial logistic regression with Lasso penalty (most accurate)
Ridge logistic regression
Support Vector Machine
Random Forest

How I built it

Model training was done in Python, while R was used to process and visualize the data.


The training data contains records for 56,000 patients, each described by 50 variables.
Between 2,235 and 2,291 features were identified for classification, of which 250 were actually used.
Five-fold cross-validation was conducted.
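
As a rough illustration of the workflow described above, the sketch below compares the four algorithms with five-fold cross-validation after reducing the feature set. It is not the project's actual code: the use of scikit-learn, the synthetic data from make_classification, the SelectKBest selection step, the three-class target, the SVM kernel, and all hyperparameters are assumptions made for the example, and the feature counts are scaled down so it runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the readmission data (assumption: NOT the real records).
X, y = make_classification(
    n_samples=300, n_features=100, n_informative=15,
    n_classes=3, random_state=0,
)

# The project kept 250 of roughly 2,250 features; scaled down here (assumption).
N_SELECTED = 25

# Hyperparameters below are illustrative guesses, not the project's settings.
models = {
    # L1 (Lasso) penalty; the saga solver fits a multinomial model for multiclass targets.
    "lasso_logistic": LogisticRegression(penalty="l1", solver="saga", C=0.5, max_iter=2000),
    # scikit-learn's default penalty for LogisticRegression is L2 (Ridge).
    "ridge_logistic": LogisticRegression(C=1.0, max_iter=2000),
    # Kernel choice is an assumption; the write-up does not state one.
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Five stratified folds, shuffled with a fixed seed for reproducibility.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Scaling and feature selection sit inside the pipeline so they are
    # re-fit on each training fold and never see the held-out fold.
    pipeline = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=N_SELECTED),
        model,
    )
    results[name] = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")

for name, scores in results.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} over {len(scores)} folds")
```

Keeping the selection step inside the pipeline is a deliberate choice in this sketch: selecting features on the full data set before splitting would leak information from the validation folds into training.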

Predictions and code are located in the linked GitHub repository.
