OBJECTIVES: Evaluating the patient impact of health professions education is a societal priority with many challenges. Researchers would benefit from a summary of topics studied and potential methodological problems. We sought to summarize key information on patient outcomes identified in a comprehensive systematic review of simulation-based instruction. DATA SOURCES: Systematic search of MEDLINE, EMBASE, CINAHL, PsycINFO, Scopus, key journals, and bibliographies of previous reviews through May 2011. STUDY ELIGIBILITY: Original research in any language measuring the direct effects on patients of simulation-based instruction for health professionals, in comparison with no intervention or other instruction. APPRAISAL AND SYNTHESIS: Two reviewers independently abstracted information on learners, topics, study quality including unit of analysis, and validity evidence. We pooled outcomes using random effects. RESULTS: From 10,903 articles screened, we identified 50 studies reporting patient outcomes for at least 3,221 trainees and 16,742 patients. Clinical topics included airway management (14 studies), gastrointestinal endoscopy (12), and central venous catheter insertion (8). There were 31 studies involving postgraduate physicians and seven studies each involving practicing physicians, nurses, and emergency medicine technicians. Fourteen studies (28%) used an appropriate unit of analysis. Measurement validity was supported in seven studies reporting content evidence, three reporting internal structure, and three reporting relations with other variables. The pooled Hedges' g effect size for 33 comparisons with no intervention was 0.47 (95% confidence interval [CI], 0.31 to 0.63), and for nine comparisons with non-simulation instruction, it was 0.36 (95% CI, −0.06 to 0.78). LIMITATIONS: Focused field in education; high inconsistency (I² > 50% in most analyses).
CONCLUSIONS: Simulation-based education was associated with small to moderate patient benefits in comparison with no intervention and with non-simulation instruction, although the latter comparison did not reach statistical significance. Unit-of-analysis errors were common, and validity evidence was infrequently reported.
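ILLUSTRATION: The abstract states only that outcomes were pooled "using random effects" and that inconsistency was summarized with I²; it does not name the estimator. The sketch below assumes the DerSimonian-Laird method, one common choice, to show how a pooled Hedges' g, its 95% CI, and I² can be computed from per-study effect sizes and variances. The function name `pool_random_effects` and the example inputs are hypothetical and are not data from the review.

```python
import math


def pool_random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    Assumption: the review's estimator is not stated in the abstract;
    DerSimonian-Laird is used here purely for illustration.
    effects   -- per-study effect sizes (e.g., Hedges' g)
    variances -- per-study sampling variances of those effect sizes
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures dispersion of study effects around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical inputs for illustration only (not taken from the review).
example = pool_random_effects([0.0, 1.0], [0.01, 0.01])
```

A confidence interval that spans zero, as reported for the comparison with non-simulation instruction, corresponds to `ci_low` being negative while `pooled` is positive.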