Model design (scheme)
Ground model (latent variables allowed)
Transition model (latent variables allowed)
Ground model parameterized
Transition model parameterized
Personalized prediction of disease progression
Alzheimer's disease (AD) is a major health challenge worldwide. Its pathology is unclear, and currently available treatments are not effective. AD has many comorbidities and risk factors, and medications used for these conditions may also help against AD. It is therefore useful to study the causal relationships among AD and its comorbidities.
What it does
The model describes the causal relationships among patient characteristics, cognitive decline, AD, and comorbidities of AD. It can also predict disease progression from complete or incomplete patient information.
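To illustrate what prediction from incomplete patient information means, here is a minimal sketch of approximate inference by likelihood weighting on a toy discrete Bayesian network. It is not the project's model and not necessarily the algorithm behind Tetrad's approximate updater: the three variables (`ad`, `mms_now`, `mms_next`), their states, and every probability below are invented for illustration.

```python
import random

# Hypothetical toy network, listed in topological order.
# Each entry: (name, parents, cpt), where cpt maps a tuple of parent
# values to a {value: probability} distribution. All numbers are made up.
NETWORK = [
    ("ad", (), {(): {"yes": 0.3, "no": 0.7}}),
    ("mms_now", ("ad",), {
        ("yes",): {"low": 0.7, "high": 0.3},
        ("no",): {"low": 0.2, "high": 0.8},
    }),
    ("mms_next", ("ad", "mms_now"), {
        ("yes", "low"): {"low": 0.9, "high": 0.1},
        ("yes", "high"): {"low": 0.5, "high": 0.5},
        ("no", "low"): {"low": 0.6, "high": 0.4},
        ("no", "high"): {"low": 0.1, "high": 0.9},
    }),
]


def likelihood_weighting(network, query, evidence, n_samples=20000, seed=0):
    """Estimate P(query | evidence) by likelihood weighting."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(n_samples):
        sample, weight = {}, 1.0
        for name, parents, cpt in network:
            dist = cpt[tuple(sample[p] for p in parents)]
            if name in evidence:
                # Observed variable: fix its value and weight the sample
                # by how likely that value is given the sampled parents.
                sample[name] = evidence[name]
                weight *= dist[evidence[name]]
            else:
                # Unobserved variable: sample it from its CPT row.
                values = list(dist)
                probs = [dist[v] for v in values]
                sample[name] = rng.choices(values, weights=probs)[0]
        totals[sample[query]] = totals.get(sample[query], 0.0) + weight
    norm = sum(totals.values())
    return {value: w / norm for value, w in totals.items()}


# Incomplete information: only the current cognitive score is known.
partial = likelihood_weighting(NETWORK, "mms_next", {"mms_now": "low"})
# Complete information: diagnosis and current score are both known.
full = likelihood_weighting(NETWORK, "mms_next", {"ad": "yes", "mms_now": "low"})
print(partial, full)
```

With these invented numbers the exact value of P(mms_next = low | mms_now = low) is 0.78, so the sampled estimate should land close to that. Variables missing from the evidence are simply sampled, which is why the same routine handles both complete and incomplete patient records.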
How I built it
The continuous variables were normalized before analysis. Because the data set is longitudinal and the number of observations differed among patients, I built the model in two parts: (1) a ground model that links the baseline measurements with the patient characteristics, and (2) a transition model that predicts the measurements at the next time point given the current time point, under the Markov assumption. I then ran the RFCI algorithm on each of the two corresponding datasets to learn the causal structure of the two models. As prior knowledge, I added 4 tiers based on the time sequence of the variables, but I did not add any required or forbidden arrows.
Next, to build a parametric model from the learned causal structure, I discretized all continuous variables and randomly split each dataset 1:1 into a training set and a validation set. I used the FGES algorithm to search the training sets for the ground model and the transition model, respectively, and then used the ML Bayes estimator to estimate the conditional probability tables from the corresponding validation sets. Both models showed very high goodness of fit (chi-square p-value close to 1).
Finally, I used the approximate updater function to predict disease progression given patient information.
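The split of the longitudinal data into a ground dataset and a transition dataset can be sketched as follows. This is a minimal illustration under assumptions, not the code used in the project: the record layout and the field names (`patient_id`, `year`, `age`, `mms`, `ad`) are hypothetical, and the example values are made up.

```python
from collections import defaultdict


def unroll(visits, baseline_keys, measure_keys):
    """Split per-visit records into ground rows and transition rows.

    visits: list of dicts with 'patient_id', 'year', and the variables.
    baseline_keys: patient characteristics (used in the ground model only).
    measure_keys: measurements repeated at every visit.
    """
    by_patient = defaultdict(list)
    for visit in visits:
        by_patient[visit["patient_id"]].append(visit)

    ground_rows, transition_rows = [], []
    for records in by_patient.values():
        records.sort(key=lambda r: r["year"])
        first = records[0]
        # Ground row: patient characteristics plus baseline measurements.
        ground_rows.append({k: first[k] for k in baseline_keys + measure_keys})
        # Transition rows: one per pair of consecutive visits, so patients
        # with different numbers of visits contribute different numbers
        # of rows (Markov assumption: next visit depends on current only).
        for cur, nxt in zip(records, records[1:]):
            row = {f"{k}_t": cur[k] for k in measure_keys}
            row.update({f"{k}_t1": nxt[k] for k in measure_keys})
            transition_rows.append(row)
    return ground_rows, transition_rows


# Made-up example: patient A has three visits, patient B has one.
visits = [
    {"patient_id": "A", "year": 0, "age": 71, "mms": 27, "ad": 0},
    {"patient_id": "A", "year": 2, "age": 71, "mms": 22, "ad": 1},
    {"patient_id": "A", "year": 1, "age": 71, "mms": 25, "ad": 0},
    {"patient_id": "B", "year": 0, "age": 80, "mms": 29, "ad": 0},
]
ground, transitions = unroll(visits, ["age"], ["mms", "ad"])
print(len(ground), len(transitions))
```

Each patient contributes exactly one ground row and one transition row per consecutive pair of visits, which is how time series of different lengths end up in two rectangular datasets that a structure-search algorithm can consume.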
Challenges I ran into
- Mixed data types
- Observational data with selection bias
- Time series of different lengths for each patient
- Building parametric models with mixed data
- The estimation step initially failed because of the Java heap size limit
Accomplishments that I'm proud of
- Previous analysis using a linear regression model indicated that patients who used cholinesterase inhibitors had a faster cognitive decline, which contradicts prior knowledge from randomized controlled clinical trials. My interpretation is that this is caused by selection bias: patients with more severe symptoms tend to use cholinesterase inhibitors. By unrolling the time series into a ground model and a transition model, the model confirmed this explanation.
- During my exploration of parametric models for mixed data, I found a function in Tetrad named "Discrete" that can be used to model discrete variables that are children of continuous variables. I used it to instantiate a generalized SEM with mixed data.
- ML Bayes estimation for both models achieved high goodness-of-fit (p=1.0).
- The model now allows for predictions on disease progression given patient information!
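Maximum-likelihood estimation of a conditional probability table for a discrete Bayesian network reduces to normalized counts. The sketch below shows that idea in plain Python. It is an illustration of the technique rather than Tetrad's ML Bayes implementation, and the variable names and rows are invented.

```python
from collections import Counter, defaultdict


def fit_cpt(rows, child, parents):
    """Estimate P(child | parents) as count(child, parents) / count(parents)."""
    joint = Counter()
    marginal = Counter()
    for row in rows:
        parent_values = tuple(row[p] for p in parents)
        joint[(parent_values, row[child])] += 1
        marginal[parent_values] += 1

    cpt = defaultdict(dict)
    for (parent_values, value), n in joint.items():
        cpt[parent_values][value] = n / marginal[parent_values]
    return dict(cpt)


# Made-up discretized rows: diagnosis and a binned cognitive score.
rows = [
    {"ad": "yes", "mms": "low"},
    {"ad": "yes", "mms": "low"},
    {"ad": "yes", "mms": "high"},
    {"ad": "no", "mms": "high"},
]
cpt = fit_cpt(rows, "mms", ["ad"])
print(cpt)
```

Parent configurations that never occur in the data get no row at all in this sketch, and child values that never occur under a given configuration are simply absent from that row. A practical estimator has to decide how to handle such gaps, for example with a prior or smoothing.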
What I learned
Most edges are congruent with prior knowledge. For example,
- Diabetes, hypertension (HTN) and heart disease seem to have common latent causes
- Cognitive function depends on AD diagnosis even after controlling for the cognitive function of the previous year
- The use of cholinesterase inhibitors is affected by cognitive function (MMS) and year of visit (mms_yr), etc.
Some other interesting findings:
- ApoE4 allele may be a contributing factor to education level (intelligence?)
- Brain atrophy is associated with heart disease and hypertension
What's next for Alzheimer's disease and comorbidities
- Add more variables
  - Medications
  - Genetic information
- Build a prognostic model for predicting cognitive decline
  - Implement inference algorithms for personalized prediction
- Build a clinical decision support system for personalized medicine against AD
  - Integrate decision theory methods on top of the prognostic model
  - Make recommendations on medications that lead to an optimal outcome for each individual patient