In evidence-based practice, randomized controlled trials (RCTs) have a very high standing. In fact, by the GRADE approach to weighing evidence for a clinical practice guideline (CPG), a single, large, well-conducted RCT can result in an evidence rating as high as a systematic review of RCTs. This post is not an attempt to argue against the use of RCTs to develop knowledge for practice. The purpose is simply to share some thoughts about the limitations of RCTs and what KBP suggests for RCT methodology planning.

RCTs are highly regarded because when a well-designed, large-sample RCT demonstrates an effect, the cause tested is the most likely explanation for the observed effect (low risk of bias). There will be variability in that effect, and the amount of variability is important for the clinician to consider: the patients you treat are not the mean of a sample. Each is a single person who will likely fall somewhere within the full range of possible variability, not necessarily at the mean, and not necessarily within the 95% confidence interval (CI), which is based on the standard error of the mean and therefore still describes sample mean statistics. (The proper interpretation of a 95% CI is that if we repeated the study many times, drawing random samples of the same size and computing a CI from each, about 95% of those intervals would contain the true population mean. It is not that 95% of individual data points will fall within that range.)
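To make that distinction concrete, here is a minimal simulation (made-up numbers, not data from any trial) showing that the 95% CI procedure captures the population mean about 95% of the time, while only a small fraction of individual patients fall inside any given interval:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean, true_sd, n, n_reps = 10.0, 5.0, 50, 10_000

ci_covers_mean = 0
frac_inside = 0.0
for _ in range(n_reps):
    sample = rng.normal(true_mean, true_sd, n)
    se = sample.std(ddof=1) / np.sqrt(n)  # standard error of the mean
    lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
    ci_covers_mean += (lo <= true_mean <= hi)                 # CI captures the mean?
    frac_inside += np.mean((sample >= lo) & (sample <= hi))   # individuals inside it?

print(f"CIs containing the true mean:       {ci_covers_mean / n_reps:.1%}")  # ~95%
print(f"Individuals inside their sample CI: {frac_inside / n_reps:.1%}")     # ~22%
```

With n = 50 the interval extends only about a quarter of a standard deviation to either side of the sample mean, so roughly a fifth of individual responses land inside it; the interval describes the mean, not your patient.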

This is a good time to explain a purpose of KBP that emerges from my field, physical therapy, that may not be present in other fields (e.g., medicine or surgery). As indicated way back in my 2005 letter to the editor of the PT Journal, physical therapists are immersed in a lot of variability in practice. Not just the variability inherent in patients with the same isolated condition when it can be diagnosed (heart failure, knee OA, multiple sclerosis), but the variability that emerges across levels of reality: from the pathological (cell, tissue, organ) level, to the impairment (body structure / function) level, to the functional (activity) level, to the interactive (person - environment interaction, disability, participation) level.

The challenges presented by the variability above are basically challenges of isolation. What needs to be isolated? A cause - effect relationship. For some disciplines there is a challenge in identifying a cause, but once identified it is “the cause” of the problem, and if there is a solution, a fix, an intervention that addresses “the cause,” then once it is applied we expect an effect. People can vary in all other aspects of their being, but due to that isolated and identified “cause” they have an “effect” that is not welcome. When the cause is isolated and identified, whether an intervention can remove, fix, or change the cause, and with it the resultant effect, can be studied effectively with an RCT. In other words, RCTs are best suited to interventions that act on isolated causes, and to measure their effect with an RCT the outcome should be the most proximal expected effect of removing the cause.

The strength of an RCT is its ability to isolate the impact of an intervention (that is, when an intervention is the experimental condition). It does not isolate by rendering all other mechanisms null in reality. It isolates by randomizing all other mechanisms and possible explanations between groups, so that the only assumed difference between the groups is the experimental condition. There are many implications of this process for reality and for the use of the results in practice decisions. If the RCT is based on a very isolated sample (everyone shares the same underlying cause of their problem) and delivers a very directed intervention (one that possibly or definitely eliminates that cause), then there should be a strong association with a large magnitude of effect.
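A small simulation (hypothetical effect sizes; “severity” stands in for any unmeasured mechanism) shows what randomization buys: prognostic factors the trialist never measured end up balanced across arms, so the raw group difference recovers the intervention's effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000  # a large sample, as the low-risk-of-bias scenario assumes

severity = rng.normal(0, 1, n)         # unmeasured prognostic mechanism
treated = rng.permutation(n) < n // 2  # random allocation, not self-selection

true_effect = 2.0  # assumed intervention effect
outcome = 5.0 + 3.0 * severity + true_effect * treated + rng.normal(0, 1, n)

# Randomization balanced the unmeasured factor between arms...
print(f"severity: treated {severity[treated].mean():+.3f}, "
      f"control {severity[~treated].mean():+.3f}")  # both near zero
# ...so the simple group difference isolates the intervention.
print(f"estimated effect: "
      f"{outcome[treated].mean() - outcome[~treated].mean():.2f}")  # near 2.0
```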

A problem can arise with null findings. A null finding may not be due to an intervention’s failure to eliminate the underlying cause. It could instead be directly associated with a sample that is too variable, where the underlying cause cannot be, or simply is not, isolated well enough to be known to be present when recruiting subjects. Such variability in the sample biases the RCT toward the null hypothesis (see the simulation sketch below). Consider recruiting patients with “back pain” for a trial of traction when traction is only mechanistically expected to work on certain causes of back pain, or recruiting patients with heart failure (HF) for a trial of electrical stimulation of skeletal muscles when we only expect electrical stimulation to work for patients with HF and specific, isolated muscle weakness. There is a movement within trials to evaluate responders and non-responders; from a clinical practice perspective this is an excellent addition to trials.

But several PT interventions are not done in isolation, and many more have proximal causes that are far from the measured outcome. For example, ultrasound simply warms tissues to a certain depth. A proximal outcome for a trial of ultrasound is whether, or to what extent, it warms the tissue. If the tissue is warmed, the next link in the causal chain is whether warming the tissue has the effect warm tissues should demonstrate. The next link is whether the effect of warming has the effect expected for … (fill in the blank; chances are the chain continues for several steps before we reach an outcome that people would measure in an RCT). My point is this: the longer the causal chain, the more opportunities for variability in response, and for covariates, to be critical to our understanding.
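Here is the dilution sketch promised above (made-up effect sizes, following the back pain / traction scenario): the intervention only helps the subgroup whose underlying cause it actually addresses, so the trial’s estimated effect shrinks toward the null as that subgroup shrinks:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000
treated = rng.permutation(n) < n // 2

for frac_with_cause in (1.0, 0.5, 0.1):
    # Only some recruits have the cause the intervention acts on
    has_cause = rng.random(n) < frac_with_cause
    effect = 3.0 * (treated & has_cause)  # works only when the cause is present
    outcome = effect + rng.normal(0, 1, n)
    diff = outcome[treated].mean() - outcome[~treated].mean()
    print(f"{frac_with_cause:4.0%} with the cause -> estimated effect {diff:.2f}")
```

With 100% of the sample carrying the cause the estimate sits near 3.0; at 10% it sits near 0.3, well on its way to a null finding even though the intervention works exactly as its mechanism predicts.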

What’s the point with regard to KBP? KBP strongly recommends the explicit use of established knowledge, represented as causal models (and even the tool of DAGs to represent the causal structure of those models), for induction, deduction, and abduction. In recent posts we have discussed adjustment sets: those well established for induction based on observational studies, and proposals for adjustment sets for deduction and abduction. Because randomization removes alternative explanations, RCTs have less often made explicit use of causal models to justify adjustment sets of covariates, or sampling approaches and methods. KBP would caution against this and would highly recommend an approach to the use of DAGs for RCT planning that parallels the one used for epidemiological / observational studies.
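As a hypothetical sketch of what that planning could look like, here is the ultrasound causal chain from above encoded as a DAG in Python with networkx (the node names and edges are my illustrative assumptions, not an established model; dedicated DAG tools such as DAGitty add formal adjustment-set logic):

```python
import networkx as nx

dag = nx.DiGraph()
dag.add_edges_from([
    ("randomized: ultrasound", "tissue temperature"),    # most proximal effect
    ("tissue temperature",     "tissue extensibility"),
    ("tissue extensibility",   "range of motion"),
    ("range of motion",        "reported function"),     # a distal trial outcome
    # covariates acting along the chain add variability to distal outcomes
    ("adipose thickness",      "tissue temperature"),
    ("baseline mobility",      "range of motion"),
])
assert nx.is_directed_acyclic_graph(dag)

# Every extra link between the randomized cause and the measured outcome
# is another place where response variability and covariates enter.
chain = nx.shortest_path(dag, "randomized: ultrasound", "reported function")
print(f"{len(chain) - 1} causal steps from intervention to outcome: "
      + " -> ".join(chain))
```

Laying the chain out this way before the trial makes the choice explicit: measure proximally (tissue temperature), or measure distally and plan the covariate adjustment set, sampling criteria, and responder analyses that the longer chain demands.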