Background: The stepped wedge trial (SWT) design has recently seen increasing use in health services research, owing to its pragmatic and methodological advantages over the parallel group design. Our objective was to summarise the statistical methods used when conducting economic evaluations alongside SWTs.
Methods: A systematic literature search extending to February 2020 was conducted in the PubMed, Scopus, Cochrane and NHS-EED databases to find and evaluate studies that intended to conduct an economic evaluation alongside a SWT. Studies were assessed for their eligibility, findings, reporting of statistical methods and quality of reporting.
Results: Of the 586 studies retrieved from the literature search, 69 studies were identified and included in this systematic review. Fifty-four studies were published protocols, with eight economic evaluations and seven studies reporting full trial results. Included studies varied in their reporting of statistical methods, in both detail and methodology. There were 34 studies that did not report any statistical methods for the economic evaluation, and only 16 studies reported appropriate methods, mainly using some form of mixed/multilevel model, and two used seemingly unrelated regression. Twelve studies reported the use of generic bootstrap methods and other modelling techniques, whilst the remaining studies failed to account appropriately for clustering, correlation or time.
Conclusions: The use of appropriate statistical methods that account for time, clustering, and correlation between costs and outcomes is an important part of SWT health economics analysis, and it would benefit from an effort to communicate the methods available and their performance.
Key points for Decision Makers
• In this methodological systematic review we have identified 69 papers reporting stepped wedge trial protocols (n=54), results (n=7) or economic evaluations (n=8).
• Statistical methods of economic evaluations alongside stepped wedge trials were often poorly reported, lacking detail and methodology. Only 16 studies reported the use (or intention to use) multilevel/mixed models, 2 seemingly unrelated regression and 8 generic bootstrap.
• It is important that appropriate statistical analyses that account for time, clustering and correlation between costs and outcomes (such as bivariate multilevel/mixed models, seemingly unrelated regressions, and the two-stage bootstrap method) are used.