Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques

Navarro, Constanza L Andaur; Damen, Johanna A A; Takada, Toshihiko; Nijman, Steven W J; Dhiman, Paula; Ma, Jie; Collins, Gary S.; Bajpai, Ram; Riley, Richard D; Moons, Karel G.M.; Hooft, Lotty

doi:10.1136/bmjopen-2020-038832

“…Our protocol lists the inclusion and exclusion criteria. 22 A study was also considered eligible if it aimed to develop a prediction model based on model extension or incremental value of new predictors. No restrictions were applied based on study design, data source, or types of patient related health outcomes.…”

Section: Methods (mentioning)

confidence: 99%

“…Our systematic review was reported following the preferred reporting items for systematic reviews and meta-analyses statement 21. The review protocol was registered and has been published 22…”

Section: Methods (mentioning)

confidence: 99%

“…Eligible publications needed to describe the development or validation of at least one multivariable prediction model using any supervised machine learning technique that aimed for individualised prediction of risk of patient related health outcomes. Our protocol lists the inclusion and exclusion criteria 22. A study was also considered eligible if it aimed to develop a prediction model based on model extension or incremental value of new predictors.…”

Section: Methods (mentioning)

confidence: 99%


Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review

Navarro, Damen, Takada et al. 2021

BMJ


Objective: To assess the methodological quality of studies on prediction models developed using machine learning techniques across all medical specialties.
Design: Systematic review.
Data sources: PubMed from 1 January 2018 to 31 December 2019.
Eligibility criteria: Articles reporting on the development, with or without external validation, of a multivariable prediction model (diagnostic or prognostic) developed using supervised machine learning for individualised predictions. No restrictions applied for study design, data source, or predicted patient related health outcomes.
Review methods: Methodological quality of the studies was determined and risk of bias evaluated using the prediction risk of bias assessment tool (PROBAST). This tool contains 21 signalling questions tailored to identify potential biases in four domains. Risk of bias was measured for each domain (participants, predictors, outcome, and analysis) and each study (overall).
Results: 152 studies were included: 58 (38%) included a diagnostic prediction model and 94 (62%) a prognostic prediction model. PROBAST was applied to 152 developed models and 19 external validations. Of these 171 analyses, 148 (87%, 95% confidence interval 81% to 91%) were rated at high risk of bias. The analysis domain was most frequently rated at high risk of bias. Of the 152 models, 85 (56%, 48% to 64%) were developed with an inadequate number of events per candidate predictor, 62 handled missing data inadequately (41%, 33% to 49%), and 59 assessed overfitting improperly (39%, 31% to 47%). Most models used appropriate data sources to develop (73%, 66% to 79%) and externally validate the machine learning based prediction models (74%, 51% to 88%). Information about blinding of outcome and blinding of predictors was, however, absent in 60 (40%, 32% to 47%) and 79 (52%, 44% to 60%) of the developed models, respectively.
Conclusion: Most studies on machine learning based prediction models show poor methodological quality and are at high risk of bias. Factors contributing to risk of bias include small study size, poor handling of missing data, and failure to deal with overfitting. Efforts to improve the design, conduct, reporting, and validation of such studies are necessary to boost the application of machine learning based prediction models in clinical practice.
Systematic review registration: PROSPERO CRD42019161764.
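The abstract reports each proportion with a 95% confidence interval but does not say how the intervals were computed. As a minimal sketch, assuming a Wilson score interval (an assumption, not something the source confirms), the calculation below can be used to check the reported figure of 148 high risk of bias ratings among 171 analyses. The helper name `wilson_interval` is hypothetical and introduced here only for illustration.

```python
import math


def wilson_interval(events: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: the source does not name its interval method; Wilson is used
    here only as a plausible choice for checking the reported numbers.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1 + z**2 / total
    # Centre of the interval is shifted towards 0.5 relative to the raw proportion.
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# 148 of 171 analyses rated at high risk of bias (figures taken from the abstract).
low, high = wilson_interval(148, 171)
print(f"{148 / 171:.0%} ({low:.0%} to {high:.0%})")
```

Under this assumption the rounded bounds agree with the 81% to 91% interval quoted in the abstract. The same call with other counts from the abstract, for example `wilson_interval(85, 152)`, gives a quick consistency check on the remaining intervals.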


“…Our systematic review protocol was registered (PROSPERO, CRD42019161764) and published [19]. We reported this systematic review following the PRISMA statement [20].…”

Section: Methods (mentioning)

confidence: 99%

Completeness of reporting of clinical prediction models developed using supervised machine learning: a systematic review

Navarro, Damen, Takada et al. 2022

BMC Med Res Methodol

Self Cite


Background: While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. We aim to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement.
Methods: We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields. We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item.
Results: Our search identified 24,814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0–46.4%) of TRIPOD items. No article fully adhered to complete reporting of the abstract and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model’s predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3).
Conclusion: Similar to prediction model studies developed using conventional regression-based techniques, the completeness of reporting is poor. Essential information to decide to use the model (i.e. model specification and its performance) is rarely reported. However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.
Systematic review registration: PROSPERO, CRD42019161764.


“…Our systematic review protocol was registered (PROSPERO, CRD42019161764) and published. 18 We reported this systematic review following the PRISMA statement. 19…”

Section: Methods (mentioning)

confidence: 99%

Completeness of reporting of clinical prediction models developed using supervised machine learning: A systematic review

Cl, Damen, Takada et al. 2021

Preprint

Self Cite


Objective: While many studies have consistently found incomplete reporting of regression-based prediction model studies, evidence is lacking for machine learning-based prediction model studies. Our aim is to systematically review the adherence of Machine Learning (ML)-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement.
Study design and setting: We included articles reporting on development or external validation of a multivariable prediction model (either diagnostic or prognostic) developed using supervised ML for individualized predictions across all medical fields (PROSPERO, CRD42019161764). We searched PubMed from 1 January 2018 to 31 December 2019. Data extraction was performed using the 22-item checklist for reporting of prediction model studies (www.TRIPOD-statement.org). We measured the overall adherence per article and per TRIPOD item.
Results: Our search identified 24 814 articles, of which 152 articles were included: 94 (61.8%) prognostic and 58 (38.2%) diagnostic prediction model studies. Overall, articles adhered to a median of 38.7% (IQR 31.0-46.4) of TRIPOD items. No articles fully adhered to complete reporting of the abstract and very few reported the flow of participants (3.9%, 95% CI 1.8 to 8.3), appropriate title (4.6%, 95% CI 2.2 to 9.2), blinding of predictors (4.6%, 95% CI 2.2 to 9.2), model specification (5.2%, 95% CI 2.4 to 10.8), and model's predictive performance (5.9%, 95% CI 3.1 to 10.9). There was often complete reporting of source of data (98.0%, 95% CI 94.4 to 99.3) and interpretation of the results (94.7%, 95% CI 90.0 to 97.3).
Conclusion: Similar to studies using conventional statistical techniques, the completeness of reporting is poor. Essential information to decide to use the model (i.e. model specification and its performance) is rarely reported. However, some items and sub-items of TRIPOD might be less suitable for ML-based prediction model studies and thus, TRIPOD requires extensions. Overall, there is an urgent need to improve the reporting quality and usability of research to avoid research waste.


Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques

Cited by 65 publications

References 34 publications
