Inspiration
It always takes guts to classify gut diseases using a machine learning model!
What it does
Our project is a multi-label classification model that predicts diseases from the microorganisms present in the gut microbiome.
How we built it
We used an Extra Trees model with hyper-parameter tuning and feature selection.
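Here is a minimal sketch of what an Extra Trees pipeline with feature selection and hyper-parameter tuning can look like in scikit-learn. It is not our exact Colab code: the synthetic dataset, the `SelectFromModel` selector, the parameter grid, and the `f1_macro` scoring are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the microbiome feature table (NOT the project's data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10,
    n_classes=3, random_state=0,
)

pipeline = Pipeline([
    # Min-max normalization, as mentioned in the write-up.
    ("scale", MinMaxScaler()),
    # Assumed selector: keep features whose tree-based importance is above the mean.
    ("select", SelectFromModel(ExtraTreesClassifier(n_estimators=100, random_state=0))),
    ("model", ExtraTreesClassifier(random_state=0)),
])

# Hypothetical search grid; the values are illustrative only.
param_grid = {
    "model__n_estimators": [100, 200],
    "model__max_depth": [None, 10],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_macro")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, round(best_score, 3))
```

Putting the scaler and selector inside the pipeline means they are refit on each training fold, so the cross-validated score is not inflated by information from the held-out fold.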
Challenges we ran into
Initially, we had trouble choosing a good model. After normalizing the data to deal with the feature imbalance problem, we thought we would use Gaussian Naive Bayes. However, its accuracy was only about 0.33. We suspect that the gut bacteria features may not follow a Gaussian distribution. Then, with help from a very helpful mentor, @anolin on Discord, and after reading some scientific literature on machine learning models for gut bacteria, we narrowed our choice down to two models: Gradient Boosting Classifier and Extra Trees Classifier. We tried both and found that the latter had lower bias with about the same level of variance as the former. Thus, we settled on the Extra Trees Classifier.
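A comparison like the one described above can be sketched with cross-validation. This is an illustration under stated assumptions, not our actual experiment: the data is synthetic, the model settings are arbitrary, and using the mean and spread of fold accuracies is only a rough proxy for bias and variance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the microbiome feature table (NOT the project's data).
X, y = make_classification(
    n_samples=300, n_features=30, n_informative=8,
    n_classes=3, random_state=0,
)

# The three model families mentioned in the write-up, with illustrative settings.
candidates = {
    "GaussianNB": GaussianNB(),
    "GradientBoosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in candidates.items():
    # Scale inside the pipeline so each fold is normalized on its own training split.
    scores = cross_val_score(make_pipeline(MinMaxScaler(), model), X, y, cv=5)
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy={scores.mean():.3f}, std={scores.std():.3f}")
```

A higher mean accuracy loosely suggests lower bias, and a smaller standard deviation across folds loosely suggests lower variance.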
Accomplishments that we're proud of
Heehee, we are proud of everything we did :) Especially the strong F1 and kappa scores *(check out our Colab code for all of the results)!*
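For readers who have not met these two metrics, here is a small sketch of how F1 and Cohen's kappa can be computed with scikit-learn. The labels below are made up for illustration and are not our results; the example also assumes one label per sample and macro averaging for F1.

```python
from sklearn.metrics import cohen_kappa_score, f1_score

# Made-up labels for three hypothetical disease classes (NOT the project's results).
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

# Macro F1: the unweighted mean of the per-class F1 scores.
macro_f1 = f1_score(y_true, y_pred, average="macro")

# Cohen's kappa: agreement between predictions and truth, corrected for chance.
kappa = cohen_kappa_score(y_true, y_pred)

print(f"macro F1 = {macro_f1:.3f}, kappa = {kappa:.3f}")
```

Kappa is a useful companion to F1 because a model that only guesses according to the class frequencies scores close to 0, even when its raw accuracy looks acceptable.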
What we learned
We got to explore a bunch of different cool machine learning models!!!
What's next for It Takes Gut to Classify Diseases!
If we had more time, it would be nice to explore hyper-parameter tuning further. Additionally, it would be cool to try data normalization techniques other than min-max.
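As a rough idea of what trying other normalization techniques could look like, the sketch below applies min-max scaling next to two possible alternatives. The tiny array is made up, and standardization and a `log1p` transform are our assumptions about reasonable candidates, not methods we have evaluated.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Made-up abundance-like values; the second column has one very large entry.
X = np.array([
    [0.0, 10.0],
    [5.0, 20.0],
    [10.0, 1000.0],
])

# Min-max: rescales each column to the [0, 1] range.
min_max = MinMaxScaler().fit_transform(X)

# Standardization: each column gets zero mean and unit variance.
standard = StandardScaler().fit_transform(X)

# Log transform: compresses large values; log1p keeps zeros at zero.
log_scaled = np.log1p(X)

print(min_max)
print(standard)
print(log_scaled)
```

With min-max scaling, the single large value in the second column squeezes the other two entries close to 0, which is one reason a log transform might be worth testing on skewed abundance data.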
Built With
- colab
- python
- scikit-learn