Causal graphs are used in epidemiological methods to help investigators identify potential confounding variables (Greenland, S., Pearl, J., & Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology, 10(1), 37–48), and they can inform the selection and use of variables in regression models (Elwert, F., & Winship, C. (2010). Effect Heterogeneity and Bias in Main-Effects-Only Regression Models. In Heuristics, Probability and Causality: A Tribute to Judea Pearl (pp. 1–10)).

Today I want to discuss the use of causal graphs in experimental designs (such as RCTs), specifically when an RCT is conducted in physical therapy to generate knowledge for clinical practice. As you may know, much of my thinking on the topic of this blog started to take shape after I responded to a 2005 editorial in PTJ (here). I was responding to this question from the editor: “What can be done to stimulate more research in physical therapy that has direct clinical relevance?” (Jette)

Adding causal graphs would not stimulate more research, but I will argue that it would result in more RCTs having direct clinical relevance. Causal graphs would do the same thing for RCTs that they do for epidemiological designs. Causal graphs (DAGs in particular) would highlight the potential alternative explanations of findings (confounders). In an observational design this is essential. However, this has not typically been considered a problem for RCTs. Why not? Because RCTs account for all alternative effects (known and unknown) through randomization. After all, if randomization works, then both (or all) groups in an RCT are similar to one another in all ways other than the trial intervention.

However, the RCT’s method of controlling for alternative effects is basically to ignore them and hope that the main effect of the trial intervention is greater than all of the noise created by any other effects that are randomly distributed between the groups. When the intervention effects are small, it takes a very large study to isolate the signal of that effect from the noise of randomly distributed effects. Even when effects are large enough, or studies are large enough, to demonstrate whether the intervention has an effect, there are most likely interacting cause-effect relationships hidden within the noise associated with randomization. If you are interested, I have previously discussed these general issues with RCTs on the blog (here).
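To put rough numbers on "a very large study," here is a minimal sketch using the standard normal-approximation sample size formula for comparing two group means. The function name `n_per_group` and the defaults (two-sided alpha of 0.05, 80% power) are my own choices for illustration, and the effect size is a standardized mean difference (the difference between groups divided by the outcome's standard deviation).

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per arm in a two-arm trial.

    Uses the normal approximation n = 2 * (z_alpha + z_beta)^2 / d^2,
    where d is the standardized mean difference. An exact t-based
    calculation gives slightly larger numbers.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Large, moderate, and small standardized effects (illustrative values).
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} participants per arm")
```

Under this approximation a moderate effect (d = 0.5) needs about 63 people per arm, while a small effect (d = 0.2) needs about 393. Halving the effect size roughly quadruples the required sample, which is why small intervention effects are so easily lost in the noise.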

RCTs will sometimes include a subgroup analysis of responders vs. non-responders in an attempt to parse this noisy variation and understand some of the underlying causal structure. This requires the investigators to have observed (measured, adjusted for) a set of variables that might be helpful to such an endeavor (i.e., an adjustment set). Collecting these variables indicates that the investigators had some prior knowledge about which variables might be important, that is, prior knowledge related to the underlying causal structure of the system they are studying (Hancock, M., Herbert, R. D., & Maher, C. G. (2009). A guide to interpretation of studies investigating subgroups of responders to physical therapy interventions. Physical Therapy, 89(7), 698–704).
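As a sketch of the arithmetic behind a subgroup analysis, the snippet below compares the pooled treatment effect with the effect inside each level of a baseline variable measured before randomization. The numbers are made up purely for illustration, and the names (`arm_difference`, `subgroup_effects`, subgroups "A" and "B") are hypothetical. A real analysis would estimate the treatment-by-subgroup interaction in a model and report its uncertainty.

```python
from statistics import mean


def make_rows(subgroup, arm, changes):
    """Build one record per participant (made-up outcome change scores)."""
    return [{"subgroup": subgroup, "arm": arm, "change": c} for c in changes]


# Made-up data: NOT from any trial, only here to show the calculation.
ROWS = (
    make_rows("A", "treatment", [4, 5, 6])
    + make_rows("A", "control", [1, 2, 3])
    + make_rows("B", "treatment", [2, 3, 4])
    + make_rows("B", "control", [2, 2, 2])
)


def arm_difference(rows):
    """Mean change in the treatment arm minus mean change in the control arm."""
    treated = [r["change"] for r in rows if r["arm"] == "treatment"]
    control = [r["change"] for r in rows if r["arm"] == "control"]
    return mean(treated) - mean(control)


def subgroup_effects(rows, key):
    """Treatment-minus-control difference within each level of `key`."""
    levels = sorted({r[key] for r in rows})
    return {
        level: arm_difference([r for r in rows if r[key] == level])
        for level in levels
    }


pooled = arm_difference(ROWS)
by_subgroup = subgroup_effects(ROWS, "subgroup")
interaction = by_subgroup["A"] - by_subgroup["B"]
print("pooled effect:", pooled)
print("effect by subgroup:", by_subgroup)
print("difference between subgroup effects:", interaction)
```

With these made-up numbers the pooled effect is 2 units, but it is 3 units in subgroup A and only 1 unit in subgroup B. The pooled estimate is correct as an average and still describes neither kind of patient well, which is the information a therapist actually needs.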

The use of a DAG in the design and reporting of an RCT would help the reader (and potential user) understand the researchers' underlying assumptions regarding the causal structure. It would also help the researchers pick the best variables to measure for subgroup analysis (those that have the potentially largest confounding effect and make identifiability of an intervention effect most challenging).

Now I would like to present a quick example based on a prior post called “Train those Inspiratory Muscles” (here).


Based on this DAG of the relationship between air flow reduction (the pathomechanism associated with ventilatory problems in COPD) and one of its clinical manifestations, dyspnea on exertion (DOE), we see that dynamic hyperinflation (DHI) plays a role and confounds the effect of having a low maximal inspiratory pressure (MIP, a measure of inspiratory muscle strength). Inspiratory muscle training (IMT) is a specific intervention that attempts to address low MIP. However, in patients whose low MIP is caused primarily by DHI, this intervention may be less effective. Measuring DHI and adjusting for this confounder in the subgroup analysis, or even stratifying on it prior to randomization with a block design, would result in a study that is more helpful to therapists considering the potential benefit of IMT for their specific patient, because it would isolate another factor that influences the effectiveness of IMT based on the underlying causal structure. (This seems simple enough, though it is mostly not done in studies of IMT.)
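To make the reasoning in the paragraph above concrete, here is a minimal sketch that encodes a small DAG and looks for common causes of low MIP and DOE. The edge list is my own reconstruction from the prose (air flow reduction leads to dynamic hyperinflation (DHI), DHI leads to both low MIP and DOE, low MIP leads to DOE, and IMT acts on low MIP). It is an assumption for illustration and not the DAGitty code from the repository. The check is also a simplified heuristic, not a full back-door or d-separation analysis.

```python
# Hypothetical edges reconstructed from the paragraph above (an assumption,
# not the repository's DAGitty model). Each key points to its direct effects.
EDGES = {
    "airflow_reduction": ["DHI"],
    "DHI": ["low_MIP", "DOE"],
    "IMT": ["low_MIP"],
    "low_MIP": ["DOE"],
    "DOE": [],
}


def descendants(graph, node, blocked=frozenset()):
    """Nodes reachable from `node` along directed edges, never entering `blocked`."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for child in graph.get(current, []):
            if child not in seen and child not in blocked:
                seen.add(child)
                stack.append(child)
    return seen


def common_causes(graph, exposure, outcome):
    """Variables that cause the exposure and also reach the outcome
    through a directed path that does not pass through the exposure."""
    found = set()
    for node in graph:
        if node in (exposure, outcome):
            continue
        causes_exposure = exposure in descendants(graph, node)
        reaches_outcome = outcome in descendants(graph, node, blocked={exposure})
        if causes_exposure and reaches_outcome:
            found.add(node)
    return found


candidates = common_causes(EDGES, "low_MIP", "DOE")
print("candidate confounders of low_MIP -> DOE:", sorted(candidates))

# With DHI also blocked, does air flow reduction still reach DOE?
still_open = "DOE" in descendants(
    EDGES, "airflow_reduction", blocked={"low_MIP", "DHI"}
)
print("airflow_reduction reaches DOE without DHI or low_MIP:", still_open)
```

Under these assumed edges, both DHI and air flow reduction show up as common causes, but air flow reduction only reaches DOE through DHI, so measuring DHI would be enough to account for that path. IMT is not flagged, because in this graph it influences DOE only through low MIP. A different set of edges would give a different answer, which is the reason to publish the DAG alongside the trial.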

Additionally, if anyone disagrees with the decisions associated with the design (stratification or subgroup analysis), then at least they can see the underlying assumptions and whether the approach taken was rational. For example, when assessing a trial I need to decide whether or not I disagree with the authors' underlying causal assumptions. If I disagree, I need to consider the grounds on which I disagree. I may disagree but be willing to accept their assumptions because I do not have good grounds to disagree. Then, accepting their assumptions, I can assess the rationality of their stratification and/or subgroup analysis, which should follow logically from the causal structure. If I do not accept their underlying assumptions and I have good grounds for why that structure is wrong, or if there is an irrational use (improper use of logic) of an accepted structure, then I can communicate these concerns (either during the review of the paper, or post publication as an editorial). But these are very separate concerns and should be articulated as such.

So when done right, the use of causal models (DAGs) helps with the design of studies that are relevant to clinical practice. When they are used, the process of reviewing the study becomes easier because the assumptions are presented more clearly. So even when done wrong, the use of causal models helps with the review and modification of the design or analysis.

So, to the question: “What can be done to stimulate more research in physical therapy that has direct clinical relevance?” Since causal structure underlies both practice and research, more effort should be made to articulate the causal structures in the design and dissemination of research for it to have direct clinical relevance.

Finally, the DAG above (with a link to DAGitty and code for modifying it) is available as part of the Physical Therapy DAG repository on GitHub (here), in the COPD repo.