STUDY QUESTION
How well do pretreatment prediction models for IVF that were developed in UK/US populations (the McLernon 2016, Luke, Dhillon, and McLernon 2022 models) perform in wider populations?
SUMMARY ANSWER
For a patient in China, the published pretreatment prediction models based on UK/US populations provide similar discriminatory power, with reasonable AUCs, but underestimate the probability of live birth.
WHAT IS KNOWN ALREADY
Several pretreatment prediction models for IVF allow patients and clinicians to estimate the cumulative probability of live birth in a cycle before treatment, but most are based on populations in Europe or the USA, and their performance and applicability in other countries and regions are largely unknown.
STUDY DESIGN, SIZE, DURATION
A total of 26 382 Chinese patients underwent oocyte pick-up cycles between January 2013 and December 2020.
PARTICIPANTS/MATERIALS, SETTING, METHODS
The UK/US models were externally validated using their published coefficients and intercepts. Centre-specific models were established with XGBoost, Lasso, and generalized linear model algorithms. Discriminatory power and calibration were compared using the area under the receiver operating characteristic curve (AUC) and calibration curves.
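As a rough illustration of the external validation step, the sketch below applies a logistic model with fixed coefficients to new patients and scores discrimination with a rank-based AUC. The intercept, coefficients, predictor names, and patient records are all hypothetical placeholders chosen for this example; they are not the values of any published model, and the logistic form itself is an assumption about how such models are typically specified.

```python
import math


def predict_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear_predictor = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative outcome")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical values for illustration only; NOT taken from any published model.
INTERCEPT = 1.5
COEFFICIENTS = {"age_years": -0.06, "infertility_years": -0.05}

# Hypothetical validation records and observed outcomes (1 = live birth).
patients = [
    {"age_years": 28, "infertility_years": 2},
    {"age_years": 41, "infertility_years": 6},
    {"age_years": 33, "infertility_years": 3},
    {"age_years": 38, "infertility_years": 1},
]
live_birth = [1, 0, 1, 0]

predicted = [predict_probability(INTERCEPT, COEFFICIENTS, p) for p in patients]
print("Predicted probabilities:", [round(p, 3) for p in predicted])
print("AUC:", auc(live_birth, predicted))
```

The pairwise AUC is quadratic in the number of patients, so for a cohort of this study's size a sorted, rank-based implementation from a statistics library would be the practical choice.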
MAIN RESULTS AND THE ROLE OF CHANCE
The AUCs for the McLernon 2016 model, Luke model, Dhillon model, and McLernon 2022 model were 0.69 (95% CI 0.68–0.69), 0.67 (95% CI 0.67–0.68), 0.69 (95% CI 0.68–0.69), and 0.67 (95% CI 0.67–0.68), respectively. The centre-specific model yielded an AUC of 0.71 (95% CI 0.71–0.72), with key predictors including age, duration of infertility, and endocrine parameters. All external models underestimated the outcome. Among the external models, the rescaled McLernon 2022 model demonstrated the best calibration (slope 1.12, intercept 0.06).
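The abstract does not state how the calibration slope and intercept were estimated. One common approach, assumed here, is to regress the observed outcome on the logit of the predicted probability with a logistic model: a slope near 1 and an intercept near 0 indicate good calibration, and a positive intercept is consistent with underestimation. The sketch below fits that two-parameter model by Newton-Raphson on hypothetical grouped data in which the predictions are deliberately shifted downwards.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_recalibration(y, p, iterations=25, tol=1e-10):
    """Fit y ~ Bernoulli(sigmoid(a + b * logit(p))); return (intercept a, slope b)."""
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0  # start from "perfectly calibrated"
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w                 # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("predictions must take at least two distinct values")
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Hypothetical grouped data: (observed live-birth rate, events, patients).
groups = [(0.2, 2, 10), (0.5, 5, 10), (0.8, 8, 10)]
SHIFT = 0.5  # hypothetical downward shift on the logit scale (underestimation)

outcomes, predictions = [], []
for rate, events, n in groups:
    outcomes += [1] * events + [0] * (n - events)
    predictions += [sigmoid(logit(rate) - SHIFT)] * n

intercept, slope = fit_recalibration(outcomes, predictions)
print("calibration intercept:", round(intercept, 3))
print("calibration slope:", round(slope, 3))
```

Because the hypothetical predictions sit a constant 0.5 below the observed rates on the logit scale, the fit recovers that shift as the intercept while leaving the slope at 1.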
LIMITATIONS, REASONS FOR CAUTION
The study is limited by its single-centre design, and its findings may not be representative of other settings. Only per-complete cycle validation was carried out, to provide a common framework for comparing different models in the sample population. Newer predictors, such as AMH, were not used.
WIDER IMPLICATIONS OF THE FINDINGS
Existing pretreatment prediction models for IVF may provide useful discriminatory power in populations different from those in which they were developed. However, models based on newer, more relevant datasets may provide better calibration.
STUDY FUNDING/COMPETING INTEREST(S)
This work was supported by the National Natural Science Foundation of China [grant number 22176159], the Xiamen Medical Advantage Subspecialty Construction Project [grant number 2018296], and the Special Fund for Clinical and Scientific Research of Chinese Medical Association [grant number 18010360765].
TRIAL REGISTRATION NUMBER
N/A.