Granular health data is available across a network of apps, but there is no centralized system to generate a quantified, holistic image of your physical health. So, although users are being proactive in their efforts to quantify themselves and monitor their health, the sheer amount of data and software they interact with doesn’t provide a quality user experience.
What it does
Let's walk through the user workflow. First, the user logs in and sees a dashboard, and this dashboard has all of the most important health metrics for that user at that point in time. Saying "at that point in time" is crucial, because it means everything is up to date in real time; there's no lag. When they wake up, they'll be able to see what their sleep metrics were from the night before, and that will update as their day continues. On the left side, you'll see a holistic score. This holistic score is how your health is doing based on five categories that we've created: blood, pollution/air quality, sleep, exercise, and diet. We're pulling data from five different sources for those five categories. Behind that score wheel, the scoring artificial intelligence looks at these different models and then takes an average to give you that number.
The second part of the AI, which is important, is how we can recommend different ways to increase that score. On the right side, in the personal recommendation tab, all of those recommendations are based on the holistic score. So if you have a score of 99% in diet, then you're unlikely to get a general recommendation there to optimize your diet or try some new type of food. Whereas if you have a 66 in sleep, then we're likely to give you a recommendation about listening to an NSDR playlist, taking magnesium pills two hours before you go to sleep, or turning off the lights at 10:00 PM so your melatonin can start to increase. So we take that data from the scoring AI, then we use it to create recommendations. And once you check off a recommendation, that's when we update your score.
That update is just some simple logic: once a recommendation is checked off, the score increases by N, and N differs based on the weight of the recommendation. Say one recommendation is about getting more sleep and another is about taking magnesium pills. Actually getting into bed and taking a nap to recover from sleep debt carries a higher weight than just taking the pills, and if a recommendation has a higher weight, you get the chance to increase the score more with it than with the other one.
Then, to the right, we have a section for quick tips. These are time-sensitive tips: in the morning, you'll see tips that are specific to the morning, and the same goes for the afternoon, the evening, and late at night. As the user, you can log in at any of those four times, or you can log in once in the morning and just see the morning tips. As for where we pulled these tips from, we looked at the best in the world for health care. We looked at people like Andrew Huberman, who has a podcast: we parsed through those transcripts and also looked through his tweets, such as the ones about getting early morning sunlight and delaying caffeine by 120 minutes after you wake up. Those are general health tips that are applicable to the whole population, but they are still high-quality tips that are non-obvious to most people.
At the bottom, you'll also see the medication section. There's one main intention for this: you have one cohort of people, senior citizens for example, who aren't taking their medications, and another cohort who don't know when to take their medication or who forget to take it as the day continues. So essentially, this current medication section helps prevent really simple mistakes that can drastically decrease the score. If you don't take this pill, your score will drop by four or five.
The last part is a history chart that shows your personal progress. How is my exercise changing over time? How is my nutrition changing over time? That simple data visualization gives you a metric to look at: here's where I started, here's where I am now. What's great about that is you can export it as a CSV or as a PDF, and then you have the rows and columns, just a bigger data frame: here's how many minutes I exercised, here's how that increased my score; here's how many minutes I slept, here's how that increased my score. You can give that to doctors, or you can use the export yourself to see which factors are most important, although the machine learning algorithms are already going to do that for you.
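The check-off logic described earlier in this section (a completed recommendation raises the score by a weight-dependent N) could look roughly like the sketch below. The recommendation names, the weight values, and the cap at 100 are assumptions made for illustration; the write-up only states that N depends on the recommendation's weight.

```python
# Hypothetical weights: the write-up only says N depends on the
# recommendation's weight, not what the actual values are.
RECOMMENDATION_WEIGHTS = {
    "nap_to_recover_sleep_debt": 5.0,  # assumed to weigh more than a supplement
    "magnesium_before_bed": 2.0,
}


def apply_completed_recommendation(score: float, recommendation: str) -> float:
    """Raise a category score by the recommendation's weight N, capped at 100."""
    n = RECOMMENDATION_WEIGHTS[recommendation]
    return min(100.0, score + n)


sleep_score = 66.0
sleep_score = apply_completed_recommendation(sleep_score, "magnesium_before_bed")
sleep_score = apply_completed_recommendation(sleep_score, "nap_to_recover_sleep_debt")
print(sleep_score)  # 73.0
```

With these assumed weights, checking off the nap moves the sleep score more than checking off the magnesium pills, which matches the behavior described above.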
How we built it
What the Health uses Next.js on the backend, a Flask API to deploy the machine learning models, HTML/Tailwind CSS/React for the frontend, and CockroachDB for frontend-backend integration and server connection.
The machine learning began with 5 XGBoost decision trees to generate a ‘feature score’ for each input data type. The 5 models, representing blood reports, sleep, exercise, nutrition, and environmental factors, were then aggregated into one final score out of 100.
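As a minimal sketch of the aggregation step, assuming each of the five models outputs a score between 0 and 100 and that the final score is a plain unweighted mean (the write-up says only that the scores are averaged), the combination could be written as follows. The function name, category keys, and example values are hypothetical.

```python
from statistics import mean


def holistic_score(category_scores: dict[str, float]) -> float:
    """Combine per-category scores (each 0-100) into one score out of 100.

    Assumes an unweighted mean; the real weighting is not specified.
    """
    for name, value in category_scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score {value} is outside 0-100")
    return round(mean(list(category_scores.values())), 1)


# Hypothetical outputs of the five category models.
example = {
    "blood": 80.0,
    "sleep": 66.0,
    "exercise": 75.0,
    "nutrition": 99.0,
    "environment": 70.0,
}
print(holistic_score(example))  # 78.0
```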
The personal recommendations were generated by first analyzing the ‘gain’ of each input feature. The ‘gain’ metric analyzes the improvement in accuracy for each input feature, outputting a list of the most valuable input features for that XGBoost model.
Aggregating and comparing the most valuable features of all the XGBoost models gives us the top 3 highest-value recommendations for users to tackle. These are the input features that, when improved, will result in the biggest increase in the final score out of 100.
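The ranking step above can be sketched as follows. In XGBoost's Python package, per-feature gain is available from a trained booster via `Booster.get_score(importance_type="gain")`, which returns a dictionary mapping feature names to gain values. To keep the sketch self-contained, it starts from such dictionaries directly. The feature names and gain values are made up, and normalizing gains within each model before comparing across models is an assumption, not something the write-up specifies.

```python
def top_recommendation_features(gains_by_model, k=3):
    """Rank features across models by their share of each model's total gain.

    gains_by_model maps a model name to a {feature: gain} dict, shaped like
    the output of Booster.get_score(importance_type="gain").
    """
    ranked = []
    for model_name, gains in gains_by_model.items():
        total = sum(gains.values())
        if total <= 0:
            continue  # a model with no recorded gain contributes nothing
        for feature, gain in gains.items():
            # Assumption: normalize per model so gains are comparable.
            ranked.append((gain / total, model_name, feature))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [(model, feature) for _, model, feature in ranked[:k]]


# Hypothetical gain values for two of the five models.
gains = {
    "sleep": {"hours_slept": 7.0, "bedtime_consistency": 2.0, "late_caffeine": 1.0},
    "exercise": {"active_minutes": 5.0, "resting_heart_rate": 3.0, "steps": 2.0},
}
for model, feature in top_recommendation_features(gains):
    print(model, feature)
# sleep hours_slept
# exercise active_minutes
# exercise resting_heart_rate
```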
Challenges we ran into
On Sunday morning, we ran into a technical problem where we weren’t able to train the generator because of the limitations of traditional neural networks. We couldn’t backpropagate over the XGBoost models to train the generator weights and biases.
Currently, a generator network like ours is trained on an input value with random noise, which it tries to map to n distinct outputs. We had 40 distinct outputs because that is the number of ‘levers’, or variables, we can move at the end of the neural network training.
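To illustrate why backpropagating through a tree-based scorer stalls, the toy sketch below uses a single hand-written split as a stand-in for a tree (it is not one of our XGBoost models, and the threshold and leaf values are invented). A tree's output is piecewise constant in its inputs, so the gradient with respect to an input is zero almost everywhere, and a generator upstream receives no training signal.

```python
def stump_score(hours_slept: float) -> float:
    """A one-split 'tree' with a hypothetical threshold and leaf values."""
    return 80.0 if hours_slept >= 7.0 else 60.0


def finite_difference(f, x: float, eps: float = 1e-3) -> float:
    """Central-difference estimate of df/dx at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Away from the split point the score does not change with small input
# changes, so the estimated gradient is zero on both sides.
print(finite_difference(stump_score, 6.0))  # 0.0
print(finite_difference(stump_score, 8.0))  # 0.0
```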
At this time, we had two paths.
The first was to find a new loss function that, unlike our current one, was differentiable. On this path, we would have had to sacrifice the ensemble of XGBoost trees we had, and letting go of these models would defeat our project’s purpose.
The second was to switch entirely off of XGBoost so that we could extract vectors of importance that were more in line with the scores people would be receiving. This required building a new neural network architecture for our previous 5 machine learning models.
At this point, we began looking at other machine learning techniques like transfer learning. All other options failed.
So we instead opted to prioritize XGBoost, because you can visualize each input feature and understand how the model weighed certain outputs over others. With the neural networks we tried, it is much harder to generate explainable recommendations, so neither the developer (us) nor the user would understand how the network prioritized an output for the result.
What we learned
Complexity shouldn’t be increased unless required. For What the Health, extreme gradient boosted trees were much more effective than a neural network. The clear interpretability, as well as the decreased training time that enables faster scaling, heavily outweighs what a neural network offered.
Trust is key when building in the healthcare space, and using XGBoost allows us to explain to the user exactly where their recommendations are coming from, instead of being stuck with a black box. What the Health can clearly explain why the input features were selected as the highest value by referencing the ‘gain’ metric of each, which is then combined with the personalized goal within that metric to output a recommendation.
For our use case, the XGBoost models performed better than neural networks on almost every criterion. They are far faster to train, taking only 10 s in total to train all 5 models. They are also cheaper and, most importantly, more accessible. Together, these properties enable cheaper and faster scaling, and even make it possible to re-run the models every single day to ensure the most up-to-date data is being delivered to the user.
By decreasing the hardware requirements of this application, What the Health is expanding the domain of impact of AI and data analytics as a whole.
What's next for What the Health?
We’re excited about the idea of aggregating data from multiple sources to give users one central page through which to access and update their health information. In the future, we’d like to:
- Integrate Apple’s HealthKit API with the web app: There is an existing cohort of 100M+ users who have their data in the app but aren’t getting the benefits of our generator and scoring AI, which would take their health from something they observe on an app to something that actively provides guidance on how to optimize for improved physical health.
- Integrate with MyFitnessPal: We applied for the MyFitnessPal API so we can track caloric intake and macronutrients from meals, and use that data to share nutrition deficiencies or other nutrient options with the user.
- Discover an economic incentive to commercialize What the Health: Aspiring or current professional athletes are an interesting initial target market because of their current priorities. This user group is already a superuser of wearables, health tracking apps, and personal trainers; working with them first gives our models a large dataset of real athlete data.