Diatree: A Random Forest Model for Predicting Diabetes Risk

Inspiration

We were inspired by the data in the US healthcare dataset, and many of us have relatives with diabetes.

What it does

This model predicts a patient's risk of diabetes based on body measurements and glucose levels that are taken during doctor's appointments. It is intended to help doctors improve communication with their patients.

How we built it

We built this model using Python and the models available in the sklearn package.

Challenges we ran into

It was very difficult to clean the dataset, especially dealing with NaN values.

Accomplishments that we're proud of

We're really proud that we were able to create a ML model in such a short period of time, and that we decided the objectives of the model, etc. ourselves.

What we learned

We didn't expect data cleaning to be as difficult as it was, so we learned more efficient ways to handle that. We also learned a lot more about different ML models and how to implement them.

What's next for Diatree: A Random Forest Model for Predicting Diabetes Risk

We're planning on improving the accuracy (overfitting, etc.) of the model, and making it more available by creating a GUI. We're also planning on expanding the model to other illnesses as time permits.