A predictive model of fall risk based on this causal structure would need only two variables: LE Power (binary, as Decreased LE Power, or continuous) and Balance (binary or continuous). The model could be trained and tested in a sample with shared characteristics - let’s say elderly subjects (over 70 years) living in an assisted living community. Data from the training group yield the parameters of the model equations; data from the testing group are used to check whether those equations accurately predict fall risk in the testing group. Because both groups come from the same sample, they are similar in any unmeasured ways that characterize people over 70 living in an assisted living community.
The model is therefore most trustworthy when applied to the same sort of sample: not a sample living in a nursing home or at home, and not a group that is more or less active than the training and testing sample. Where people attempt to walk (i.e., how active their lifestyle is) will influence their risk in a way that this model does not account for, and the very low power seen in a nursing home group lies outside the range over which the model’s parameters were estimated. The model also does not account for cognitive status, which could be very important in a nursing home sample, where attempting to get up when they shouldn’t is a major cause of falling.
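The training and testing workflow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are simulated, the coefficients in the data-generating rule are invented, and the function names (`simulate_sample`, `fit_logistic`, `accuracy`) are hypothetical. A logistic model is one reasonable choice for a binary fall outcome, not necessarily the model the text has in mind.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_sample(n, rng):
    """Simulate subjects as (le_power, balance, fell). All values are made up."""
    rows = []
    for _ in range(n):
        le_power = rng.gauss(0.0, 1.0)  # standardized LE power
        balance = rng.gauss(0.0, 1.0)   # standardized balance score
        # Assumed rule: lower power and lower balance both raise fall risk.
        p_fall = sigmoid(-0.5 - 1.5 * le_power - 1.5 * balance)
        rows.append((le_power, balance, 1 if rng.random() < p_fall else 0))
    return rows


def fit_logistic(rows, epochs=500, lr=0.5):
    """Estimate intercept and two slopes by batch gradient descent on log loss."""
    b0 = b1 = b2 = 0.0
    n = len(rows)
    for _ in range(epochs):
        g0 = g1 = g2 = 0.0
        for x1, x2, y in rows:
            err = sigmoid(b0 + b1 * x1 + b2 * x2) - y
            g0 += err
            g1 += err * x1
            g2 += err * x2
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
        b2 -= lr * g2 / n
    return b0, b1, b2


def accuracy(params, rows):
    """Share of subjects whose predicted class (cutoff 0.5) matches the outcome."""
    b0, b1, b2 = params
    hits = sum(
        (sigmoid(b0 + b1 * x1 + b2 * x2) >= 0.5) == bool(y)
        for x1, x2, y in rows
    )
    return hits / len(rows)


rng = random.Random(0)
sample = simulate_sample(600, rng)       # one sample, e.g. one assisted living community
train, test = sample[:400], sample[400:]  # both groups come from the same sample
params = fit_logistic(train)              # training data yield the parameters
test_accuracy = accuracy(params, test)    # testing data check the predictions
print(params, test_accuracy)
```

Because `train` and `test` are drawn by the same rule, the test accuracy says nothing about a nursing home or home-dwelling sample. Probing that would require a test set generated under a different rule, for example one with much lower power or with an unmodeled cause such as cognitive status.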
Predictive analytics are based on causal models. Data allow us to test for associations and to derive parameters. What is also necessary is a rational process of interpretation that fits the data and uses them to test the assumptions of a causal model. Training complex models without any consideration of the causal model makes the resulting predictive analytic tool less helpful to the underlying clinical reasoning. Claiming that a big data predictive model is “associational” (that is, based purely on statistical associations and not on causes) is an assumption that undermines the entire purpose of a predictive model. Only when we understand the cause - effect structure can we claim to predict future events based on the values of causes and the resultant (future) changes to effects. The causal structure helps determine the generalizability of the model and the samples in which it is most likely to achieve the same predictive accuracy.
For students taking pathology, what does this mean? It means that if you are looking to make a DAG for prognosis (prediction), you simply make a DAG of the cause - effect structure that runs from pathomechanisms to clinical manifestations, since clinical manifestations almost always include the set of outcomes we are interested in predicting. Remember, clinical manifestations include the consequences of pathomechanisms.
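As a minimal sketch of what such a DAG looks like when written down, the fall example can be stored as a mapping from each cause to its direct effects. The edges below are assumed for illustration only, and the helper names (`is_acyclic`, `causes_of`) are hypothetical. The causes of the outcome are the candidate predictors for a prognostic model.

```python
# Hypothetical DAG: each key is a cause, each value lists its direct effects.
# These edges are an assumption for illustration, not a claim about the
# actual causal structure.
dag = {
    "Decreased LE Power": ["Balance", "Fall"],
    "Balance": ["Fall"],
    "Fall": [],
}


def is_acyclic(dag):
    """Check that the graph has no cycles (Kahn's algorithm)."""
    indegree = {node: 0 for node in dag}
    for effects in dag.values():
        for effect in effects:
            indegree[effect] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for effect in dag[node]:
            indegree[effect] -= 1
            if indegree[effect] == 0:
                ready.append(effect)
    return visited == len(dag)


def causes_of(dag, outcome):
    """Return every node with a directed path into the outcome."""
    parents = {
        node: [cause for cause, effects in dag.items() if node in effects]
        for node in dag
    }
    found, stack = set(), [outcome]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


predictors = causes_of(dag, "Fall")
print(is_acyclic(dag), sorted(predictors))
```

For a pathology DAG, the same structure applies with pathomechanisms as the upstream nodes and clinical manifestations as the downstream nodes.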