Objective
Unbiased assessment of tumour response is crucial in randomised controlled trials (RCTs). Blinded independent central review is usually used to supplement or monitor local assessment, but it is costly. This study investigated whether systematic bias exists in RCTs by comparing the treatment effects on efficacy endpoints between central and local assessments.

Design
Literature review, pooling analysis and correlation analysis.

Data sources
PubMed, from 1 January 2010 to 30 June 2017.

Eligibility criteria for selecting studies
Eligible articles were phase III RCTs comparing anticancer agents for advanced solid tumours. In addition, articles had to report objective response rate (ORR), disease control rate (DCR), progression-free survival (PFS) or time to progression (TTP), with the treatment effect for these endpoints (OR or HR) based on both central and local assessments.

Results
Of 76 included trials involving 45 688 patients, 17 (22%) reported endpoints with statistically inconsistent inferences between central and local assessments (p value below the probability of type I error for one assessment but above it for the other); of these, 9 (53%) had a statistically significant inference based on central assessment. Pooling analysis showed no systematic bias when the treatment effects from the two assessments were compared (ORR: OR=1.02 (95% CI 0.97 to 1.07), p=0.42, I²=0%; DCR: OR=0.97 (95% CI 0.92 to 1.03), p=0.32, I²=0%; PFS: HR=1.01 (95% CI 0.99 to 1.02), p=0.32, I²=0%; TTP: HR=1.04 (95% CI 0.95 to 1.14), p=0.37, I²=0%), regardless of funding source, masking, region, tumour type, study design, number of enrolled patients, response assessment criteria, primary endpoint, or whether trial inferences were statistically consistent or inconsistent. Correlation analysis likewise showed no sign of systematic bias between central and local assessments (ORR, DCR, PFS: r>0.90, p<0.01; TTP: r=0.90, p=0.29).

Conclusions
No systematic bias was found between local and central assessments in phase III RCTs of solid tumours. However, the two assessments yielded statistically inconsistent inferences in many trials.
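
The pooled estimates and I² values reported in Results can be illustrated with a minimal sketch. The abstract does not state which pooling model was used, so this example assumes inverse-variance fixed-effect pooling of per-trial central-to-local ratios on the log scale. The function names `se_from_ci` and `pool_fixed_effect` and the three trial-level ratios are hypothetical, chosen only for illustration, and are not data from the study.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log ratio from its 95% CI bounds."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def pool_fixed_effect(log_ratios, ses, z=1.96):
    """Inverse-variance fixed-effect pooling (an assumed model, not
    necessarily the one used in the study).

    Returns (pooled ratio, CI lower, CI upper, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ratios)) / total
    se_pooled = math.sqrt(1.0 / total)
    # Cochran's Q measures between-trial heterogeneity.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ratios))
    df = len(log_ratios) - 1
    # I^2 is the share of Q in excess of its degrees of freedom, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return (
        math.exp(pooled),
        math.exp(pooled - z * se_pooled),
        math.exp(pooled + z * se_pooled),
        i2,
    )

# Hypothetical per-trial ratios (central HR / local HR) with 95% CIs.
trials = [(1.02, 0.90, 1.16), (0.98, 0.85, 1.13), (1.01, 0.92, 1.11)]
log_ratios = [math.log(ratio) for ratio, _, _ in trials]
ses = [se_from_ci(lo, hi) for _, lo, hi in trials]

ratio, ci_low, ci_high, i2 = pool_fixed_effect(log_ratios, ses)
print(f"pooled ratio={ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), I2={i2:.0f}%")
```

A pooled ratio whose confidence interval includes 1 corresponds to the "no systematic bias" reading in the abstract: on average, neither assessment produces a larger treatment effect than the other. Note that this says nothing about individual trials, where the two assessments can still fall on opposite sides of the significance threshold.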