Background
The lens of complexity theory is widely advocated to improve health care delivery. However, empirical evidence that this lens has been useful in designing health care remains elusive. This review assesses whether it is possible to reliably capture evidence for efficacy in results or process within interventions that were informed by complexity science and closely related conceptual frameworks.

Methods
Systematic searches of scientific and grey literature were undertaken in late 2015/early 2016. Titles and abstracts were screened for interventions (A) delivered by the health services, (B) that explicitly stated that complexity science provided theoretical underpinning, and (C) also reported specific outcomes. Outcomes had to relate to changes in actual practice, service delivery or patient clinical indicators. Data extraction and detailed analysis were undertaken for studies in three developed countries: Canada, UK and USA. Data were extracted for intervention format, barriers encountered and quality aspects (thoroughness or possible biases) of evaluation and reporting.

Results
From 5067 initial finds in scientific literature and 171 items in grey literature, 22 interventions described in 29 articles were selected. Most interventions relied on facilitating collaboration to find solutions to specific or general problems. Many outcomes were very positive. However, some outcomes were measured only subjectively, one intervention was designed with complexity theory in mind but did not reiterate this in subsequent evaluation, and other interventions were credited as compatible with complexity science but reported no relevant theoretical underpinning.
Articles often omitted discussion on implementation barriers or unintended consequences, which suggests that complexity theory was not widely used in evaluation.

Conclusions
It is hard to establish cause and effect when attempting to leverage complex adaptive systems and perhaps even harder to reliably find evidence that confirms whether complexity-informed interventions are usually effective. While it is possible to show that interventions that are compatible with complexity science seem efficacious, it remains difficult to show that explicit planning with complexity in mind was particularly valuable. Recommendations are made to improve future evaluation reports, to establish a better evidence base about whether this conceptual framework is useful in intervention design and implementation.

Electronic supplementary material
The online version of this article (doi:10.1186/s13012-016-0492-5) contains supplementary material, which is available to authorized users.