My inspiration for this build is my childhood struggle with being overweight. Losing weight was often difficult because it demanded too much sudden change. Identifying the most important factors could make the process easier: changing only those would be slower and more gradual, yet because they matter most, the effect would still be apparent.
What it does
Causes of Obesity is a model that predicts a person's weight class and identifies the factors that were most important in making that prediction.
How we built it
Using a Random Forest classifier, I tried to correlate weight class with genetic traits, including:
- Age
- Height
- History of Obesity
- Gender
In addition, I tried to correlate it with lifestyle choices such as:
- Calorie Consumption
- Number of Main Meals
- Cups of Vegetables Consumed
- Technology Use in Hours
- Amount of Physical Activity
- Whether Snacks Were Consumed
- Calorie Tracking
- Cups of Water Drunk
- Consumption of Alcohol
Datasets: I used two datasets, one that included both lifestyle choices and genetic traits and one that included only lifestyle choices, and trained a separate model on each. The first model used 13 traits (genetic and lifestyle) to look for correlations between these factors and obesity. The second model used only the 9 lifestyle traits.
Model: I used a Random Forest classifier because it is easy to use, tends to be accurate on tabular data like this, and is well documented online.
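The two-model setup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, the column names, the labeling rule, and the `train_and_score` helper are hypothetical, and the hyperparameters are not taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): NOT the project's dataset.
rng = np.random.default_rng(0)
n = 600
age = rng.integers(14, 60, n)
height = rng.normal(1.70, 0.10, n)
vegetables = rng.uniform(0, 3, n)
water = rng.uniform(1, 3, n)
activity = rng.uniform(0, 3, n)

# Hypothetical rule that turns a noisy score into three weight classes (0, 1, 2).
score = 0.04 * age - 0.8 * vegetables - 0.6 * activity + rng.normal(0, 0.5, n)
weight_class = np.digitize(score, np.quantile(score, [1 / 3, 2 / 3]))


def train_and_score(X, y, seed=0):
    """Fit a random forest on a train split and return it with its test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))


# Model 1: genetic + lifestyle columns. Model 2: lifestyle columns only.
X_full = np.column_stack([age, height, vegetables, water, activity])
X_lifestyle = np.column_stack([vegetables, water, activity])

model_full, acc_full = train_and_score(X_full, weight_class)
model_lifestyle, acc_lifestyle = train_and_score(X_lifestyle, weight_class)

print(f"Lifestyle + genetic accuracy: {acc_full:.2%}")
print(f"Lifestyle-only accuracy:      {acc_lifestyle:.2%}")
```

Holding out a test split keeps the reported accuracy from being measured on rows the forest has already seen.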
Results:
Model 1 (Lifestyle + Genetic Traits): Accuracy 87.7068%
Model 2 (Lifestyle Traits): Accuracy 73.5224%
Model 1's higher accuracy suggests that there is a significant correlation between a person's weight and their genes. However, Model 2 shows that there is also a somewhat weaker but still strong correlation between one's lifestyle and their weight.
The most significant factors for Model 1 were (in order):
- Age (~15%)
- Vegetables (~14%)
- Height (~13%)
- Main Meals (~8%)
The most significant factors for Model 2 were (in order):
- Vegetables (~20%)
- Water (~16%)
- Technology (~16%)
- Physical Activity (~15%)
Using this data, we can infer that one of the most important lifestyle changes one could make is eating more vegetables.
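A ranking like the ones above can be read off a fitted forest. The sketch below assumes the percentages came from scikit-learn's impurity-based `feature_importances_`, which the writeup does not state. The data and the labeling rule are synthetic and hypothetical, so the printed numbers will not match the project's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption): four lifestyle-style columns.
rng = np.random.default_rng(1)
n = 500
feature_names = ["vegetables", "water", "technology", "physical_activity"]
X = rng.uniform(0, 3, size=(n, len(feature_names)))

# Hypothetical label driven mostly by the first column, slightly by the last.
y = (X[:, 0] + 0.3 * X[:, 3] + rng.normal(0, 0.3, n) < 2.0).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, y)

# feature_importances_ holds one non-negative value per column, summing to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.1%}")
```

Impurity-based importances describe how much the forest relied on each column, not how much changing that column would change someone's weight.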
Challenges we ran into
One of the biggest challenges I ran into while making this build was finding a dataset: there are hundreds of datasets on this one topic, and finding one that checks all the boxes is extremely hard. Even the dataset I chose was not perfect, and I had to do a significant amount of editing to make it more accurate.
Accomplishments that we're proud of
I am proud of creating a model with an accuracy of almost 90 percent, which shows a strong correlation between these attributes and health. In addition, I am proud of the positive effect that my model could have on people who struggle with weight loss.
What we learned
I learned how to visualize data through confusion matrices, histograms, and plots, using libraries such as sklearn and matplotlib.
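As a small illustration of the confusion matrices mentioned above, here is a sketch using `sklearn.metrics.confusion_matrix`. The class names and the true and predicted labels are made up for the example and are not the project's results.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical class names and labels, for illustration only.
labels = ["normal", "overweight", "obese"]
y_true = ["normal", "normal", "overweight", "obese",
          "obese", "overweight", "normal", "obese"]
y_pred = ["normal", "overweight", "overweight", "obese",
          "overweight", "overweight", "normal", "obese"]

# Rows are true classes and columns are predicted classes, in `labels` order.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Correct predictions sit on the diagonal, so accuracy is trace / total.
accuracy = cm.trace() / cm.sum()
print(f"accuracy: {accuracy:.2%}")
```

The off-diagonal cells show which classes get confused with each other, which a single accuracy number hides. If matplotlib is installed, `sklearn.metrics.ConfusionMatrixDisplay.from_predictions` can draw the same matrix as a plot.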
What's next for Causes of Obesity
What's next for Causes of Obesity is finding out whether changing these attributes actually makes a difference, because correlation may suggest causation but does not prove it.