Theory predicts that regression discontinuity (RD) provides valid causal inference at the cutoff score that determines treatment assignment. One purpose of this paper is to test RD's internal validity across 15 studies. Each assesses the correspondence between causal estimates from an RD study and a randomized controlled trial (RCT) when the estimates are made at the same cutoff point, where they should not differ asymptotically. However, statistical error, imperfect design implementation, and a plethora of possible analysis options mean that they might nonetheless differ. We test whether they do, assuming that the potential for bias is greater with RDs than with RCTs. A second purpose of this paper is to investigate the external validity of RD by exploring how the size of the bias estimates varies across the 15 studies, for they differ in their settings, interventions, analyses, and implementation details. Both Bayesian and frequentist meta-analysis methods show that the RD bias is below 0.01 standard deviations on average, indicating RD's high internal validity. When the study-specific estimates are shrunken to capitalize on the information the other studies provide, all the RD causal estimates fall within 0.07 standard deviations of their RCT counterparts, indicating high external validity as well. With unshrunken estimates, the mean RD bias is still essentially zero, but the distribution of bias estimates is less tight, especially with smaller samples and when parametric RD analyses are used.
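The shrinkage step described above is, in general terms, the empirical-Bayes pooling that a random-effects meta-analysis performs: each study's observed RD-minus-RCT gap is pulled toward the pooled mean in proportion to its sampling uncertainty. The sketch below illustrates that mechanic with a DerSimonian-Laird estimator and wholly hypothetical bias values and standard errors; it is not the authors' code, and the paper's Bayesian and frequentist models may differ in detail.

```python
# Illustrative sketch of random-effects shrinkage of study-specific bias
# estimates. The bias values and standard errors are made up, not the 15
# within-study comparisons analyzed in the paper.
import numpy as np

bias = np.array([0.05, -0.10, 0.02, 0.15, -0.06])  # hypothetical RD - RCT gaps (SD units)
se = np.array([0.03, 0.05, 0.02, 0.07, 0.04])      # hypothetical standard errors

# DerSimonian-Laird estimate of the between-study variance tau^2.
w_fixed = 1.0 / se**2
mu_fixed = np.sum(w_fixed * bias) / np.sum(w_fixed)
Q = np.sum(w_fixed * (bias - mu_fixed) ** 2)
df = len(bias) - 1
c = np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)
tau2 = max(0.0, (Q - df) / c)

# Random-effects pooled mean bias.
w_re = 1.0 / (se**2 + tau2)
mu_re = np.sum(w_re * bias) / np.sum(w_re)

# Empirical-Bayes (shrunken) study-specific estimates: each observed bias is
# pulled toward the pooled mean in proportion to its sampling uncertainty.
shrinkage = tau2 / (tau2 + se**2)
bias_shrunk = shrinkage * bias + (1 - shrinkage) * mu_re

print(f"pooled mean bias: {mu_re:.3f} SD")
print("shrunken estimates:", np.round(bias_shrunk, 3))
```

Studies with larger standard errors receive more shrinkage, which is why the shrunken estimates cluster more tightly around the pooled mean than the raw, unshrunken ones.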
Learning health systems use routinely collected electronic health data (EHD) to advance knowledge and support continuous learning. Even without randomization, observational studies can play a central role as the nation’s health care system embraces comparative effectiveness research and patient-centered outcomes research. However, neither the breadth, timeliness, and volume of the available information nor sophisticated analytics allow analysts to confidently infer causal relationships from observational data. Nevertheless, depending on the research question, careful study design and appropriate analytical methods can improve the utility of EHD.

As the introduction to a series of four papers, this review begins with a discussion of the kinds of research questions that EHD can help address, noting how different evidence and assumptions are needed for each. We argue that when the question involves describing the current (and likely future) state of affairs, causal inference is not relevant, so randomized clinical trials (RCTs) are not necessary. When the question is whether an intervention improves outcomes of interest, causal inference is critical, but appropriately designed and analyzed observational studies can yield valid results that better balance internal and external validity than typical RCTs. When the question is one of translation and spread of innovations, a different set of questions comes into play: How and why does the intervention work? How can a model be amended or adapted to work in new settings? In these “delivery system science” settings, causal inference is not the main issue, so a range of quantitative, qualitative, and mixed research designs are needed.

We then describe why RCTs are regarded as the gold standard for assessing cause and effect, how alternative approaches relying on observational data can be used to the same end, and how observational studies of EHD can be effective complements to RCTs. We also describe how RCTs can serve as a model for designing rigorous observational studies, building an evidence base through iterative studies that build upon each other (i.e., confirmation across multiple investigations).
Although funded projects explored many identified priority topics, investigators noted that much work remains. For example, given the considerable investments in CER data infrastructure, the methods development field can benefit from additional efforts to educate researchers about the availability of new data sources and about how best to apply methods to match their research questions and data.
This paper meta-analyzes 12 heterogeneous studies that examine bias in the comparative interrupted time-series (CITS) design that is often used to evaluate the effects of social policy interventions. To measure bias, each CITS impact estimate was differenced from the estimate derived from a theoretically unbiased causal benchmark study that tested the same hypothesis with the same treatment group, outcome data, and estimand. In 10 studies the benchmark was a randomized experiment, and in the other two it was a regression-discontinuity study. Analyses revealed the average standardized CITS bias to be between −0.01 and 0.042 standard deviations, and all but one bias estimate from individual studies fell within 0.10 standard deviations of its benchmark, indicating that the near-zero mean bias did not result from averaging many large single-study differences. The low mean and generally tight distribution of individual bias estimates suggest that CITS studies are worth recommending for future causal hypothesis tests because: (1) over the studies examined, they generally resulted in high internal validity; and (2) they also promise high external validity because the empirical tests we synthesized occurred across a wide variety of settings, times, interventions, and outcomes.
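The benchmarking arithmetic the abstract describes reduces to a simple difference expressed in standard-deviation units of the outcome. The snippet below is a minimal illustration with made-up numbers, not the synthesized estimates from the 12 studies; the helper name and inputs are hypothetical.

```python
# Minimal sketch of the standardized-bias calculation for one within-study
# comparison: the CITS impact estimate minus the benchmark (RCT or RD)
# estimate for the same estimand, scaled by the outcome's standard deviation
# so that biases are comparable across studies. All values are hypothetical.
def standardized_bias(cits_estimate: float, benchmark_estimate: float, outcome_sd: float) -> float:
    """Return (CITS - benchmark) in standard-deviation units of the outcome."""
    return (cits_estimate - benchmark_estimate) / outcome_sd

# Hypothetical example: a CITS effect of 2.1 points versus an RCT effect of
# 1.8 points on an outcome with SD 10 gives a bias of roughly 0.03 SD.
print(standardized_bias(2.1, 1.8, 10.0))
```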