Development and internal validation of a machine-learning-developed model for predicting 1-year mortality after fragility hip fracture

Kitcharanant, Nitchanant; Chotiyarnwong, Pojchong; Tanphiriyakun, Thiraphat; Vanitcharoenkul, Ekasame; Mahaisavariya, Chantas; Boonyaprapa, Wichian; Unnanuntana, Aasis

doi:10.1186/s12877-022-03152-x

Cited by 13 publications

(18 citation statements)

References 65 publications

“…Of 39 studies that met all criteria and were included in this analysis, 18 studies (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 studies (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A PRISMA flowchart of included studies is displayed in eFigure 1 in Supplement 1.…”

Section: Results (mentioning)

confidence: 99%

“…Machine learning models have been developed to predict the outcome of 6 different postoperative outcomes following hip fracture surgery: mortality (15 studies), length of stay (3 studies), delirium (1 study), discharge destination (1 study), hospital cost (1 study), 30-day major complications (1 study), and functional independence measure (1 study) (Table 2). Age (18 of 21 studies [85.7%]) and sex (17 of 21 studies [80.9%]) were the most used features, whereas all other input features varied widely across studies and databases (eTable 4 in Supplement 1).…”

Section: Results (mentioning)

confidence: 99%

“…In this study, the meta-analysis conducted to compare the accuracy of these ML models revealed that the models are comparable with the mean performance of expert clinicians at diagnosing hip fractures. Across all included studies, there was a wider range of sensitivity and specificity compared with clinician performance. However, the range was negatively skewed by 1 study that attempted to classify fractures into 3 different categories despite a relatively small training sample size.…”

Section: Discussion (mentioning)

confidence: 99%

“…Included Studies on Application of Artificial Intelligence for Diagnosis of Hip Fractures. Artificial Intelligence for Hip Fracture Detection and Outcome Prediction. Age (18 of 21 studies [47, 49-52, 54-61, 63-67] [85.7%]) and sex (17 of 21 studies [80.9%] [47-52, 54-58, 60, 61, 64-67]) …”

mentioning

confidence: 99%

Artificial Intelligence for Hip Fracture Detection and Outcome Prediction

et al. 2023

Importance: Artificial intelligence (AI) enables powerful models for establishment of clinical diagnostic and prognostic tools for hip fractures; however, the performance and potential impact of these newly developed algorithms are currently unknown.

Objective: To evaluate the performance of AI algorithms designed to diagnose hip fractures on radiographs and predict postoperative clinical outcomes following hip fracture surgery relative to current practices.

Data Sources: A systematic review of the literature was performed using the MEDLINE, Embase, and Cochrane Library databases for all articles published from database inception to January 23, 2023. A manual reference search of included articles was also undertaken to identify any additional relevant articles.

Study Selection: Studies developing machine learning (ML) models for the diagnosis of hip fractures from hip or pelvic radiographs or to predict any postoperative patient outcome following hip fracture surgery were included.

Data Extraction and Synthesis: This study followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses and was registered with PROSPERO. Eligible full-text articles were evaluated and relevant data extracted independently using a template data extraction form. For studies that predicted postoperative outcomes, the performance of traditional predictive statistical models, either multivariable logistic or linear regression, was recorded and compared with the performance of the best ML model on the same out-of-sample data set.

Main Outcomes and Measures: Diagnostic accuracy of AI models was compared with the diagnostic accuracy of expert clinicians using odds ratios (ORs) with 95% CIs. Areas under the curve for postoperative outcome prediction between traditional statistical models (multivariable linear or logistic regression) and ML models were compared.

Results: Of 39 studies that met all criteria and were included in this analysis, 18 (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A total of 39 598 plain radiographs and 714 939 hip fractures were used for training, validating, and testing ML models specific to diagnosis and postoperative outcome prediction, respectively. Mortality and length of hospital stay were the most predicted outcomes. On pooled data analysis, compared with clinicians, the OR for diagnostic error of ML models was 0.79 (95% CI, 0.48-1.31; P = .36; I² = 60%) for hip fracture radiographs. For the ML models, the mean (SD) sensitivity was 89.3% (8.5%), specificity was 87.5% (9.9%), and F1 score was 0.90 (0.06). The mean area under the curve for mortality prediction was 0.84 with ML models compared with 0.79 for alternative controls (P = .09).

Conclusions and Relevance: The findings of this systematic review and meta-analysis suggest that the potential applications of AI to aid with diagnosis from hip radiographs are promising. The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons. However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.

“…Elective noncardiac surgery. Mortality: 30 days and/or 1 yr [26-38]; in surgical patients with perioperative SARS-CoV-2 [39]. Morbidity: multiple postoperative complications [26, 27, 29, 40-51]; acute and chronic pain [52-57]; acute kidney failure [52, 58-63]; ASA score prediction [64]; delirium and cognitive decline [65-70]; cerebral/myocardial infarction [71]; difficult intubation prediction [72]; ileus [73]; infection risk [74-76]; myocardial injury [77]; nausea and vomiting [78]; obstructive apnoea screening [79]; perioperative transfusion [80, 81]; postoperative atrial fibrillation [82]; respiratory failure and depression; liver failure [117]; major bleeding [118, 119]; kidney failure…”

Section: Surgery Outcomes and Events (mentioning)

confidence: 99%

Prediction of Complications and Prognostication in Perioperative Medicine: A Systematic Review and PROBAST Assessment of Machine Learning Tools

Arina, Kaczorek, Hofmaenner et al. 2023

Anesthesiology

Background: The utilization of artificial intelligence and machine learning as diagnostic and predictive tools in perioperative medicine holds great promise. Indeed, many studies have been performed in recent years to explore the potential. The purpose of this systematic review is to assess the current state of machine learning in perioperative medicine, its utility in prediction of complications and prognostication, and limitations related to bias and validation.

Methods: A multidisciplinary team of clinicians and engineers conducted a systematic review using the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) protocol. Multiple databases were searched, including Scopus, Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Cochrane Library, PubMed, Medline, Embase, and Web of Science. The systematic review focused on study design, type of machine learning model used, validation techniques applied, and reported model performance on prediction of complications and prognostication. This review further classified outcomes and machine learning applications using an ad hoc classification system. The Prediction model Risk Of Bias Assessment Tool (PROBAST) was used to assess risk of bias and applicability of the studies.

Results: A total of 103 studies were identified. The models reported in the literature were primarily based on single-center validations (75%), with only 13% being externally validated across multiple centers. Most of the mortality models demonstrated a limited ability to discriminate and classify effectively. The PROBAST assessment indicated a high risk of systematic errors in predicted outcomes and artificial intelligence or machine learning applications.

Conclusions: This systematic review indicates that the application of machine learning in perioperative medicine is still at an early stage. While many studies suggest potential utility, several key challenges must first be overcome before their introduction into clinical practice.
