Comparison of Machine Learning Methods With Traditional Models for Use of Administrative Claims With Electronic Medical Records to Predict Heart Failure Outcomes

Desai, Rishi; Wang, Shirley; Vaduganathan, Muthiah; Evers, Thomas; Schneeweiß, Sebastian

doi:10.1001/jamanetworkopen.2019.18962

Cited by 201 publications

(164 citation statements)

References 32 publications

Supporting

Mentioning

153

Contrasting

Unclassified


“…A UK Biobank study of risk prediction for cardiovascular disease did not report how censoring was dealt with, [7] like several other studies. [39][40][41] Another machine learning study incorrectly excluded censored patients. [8] Random survival forest is a machine learning model that takes account of censoring.…”

Section: Discussion (mentioning)

confidence: 99%

Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: longitudinal cohort study using cardiovascular disease as exemplar

Sperrin, Ashcroft, et al. 2020. BMJ


Objective: To assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions.

Design: Longitudinal cohort study from 1 January 1998 to 31 December 2018.

Setting and participants: 3.6 million patients from the Clinical Practice Research Datalink registered at 391 general practices in England with linked hospital admission and mortality records.

Main outcome measures: Model performance including discrimination, calibration, and consistency of individual risk prediction for the same patients among models with comparable model performance. 19 different prediction techniques were applied, including 12 families of machine learning models (grid searched for best models), three Cox proportional hazards models (local fitted, QRISK3, and Framingham), three parametric survival models, and one logistic model.

Results: The various models had similar population level performance (C statistics of about 0.87 and similar calibration). However, the predictions for individual risks of cardiovascular disease varied widely between and within different types of machine learning and statistical models, especially in patients with higher risks. A patient with a risk of 9.5-10.5% predicted by QRISK3 had a risk of 2.9-9.2% in a random forest and 2.4-7.2% in a neural network. The differences in predicted risks between QRISK3 and a neural network ranged between –23.2% and 0.1% (95% range). Models that ignored censoring (that is, assumed censored patients to be event free) substantially underestimated risk of cardiovascular disease. Of the 223 815 patients with a cardiovascular disease risk above 7.5% with QRISK3, 57.8% would be reclassified below 7.5% when using another model.

Conclusions: A variety of models predicted risks for the same patients very differently despite similar model performances. The logistic models and commonly used machine learning models should not be directly applied to the prediction of long term risks without considering censoring. Survival models that consider censoring and that are explainable, such as QRISK3, are preferable. The level of consistency within and between models should be routinely assessed before they are used for clinical decision making.
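The abstract's point that models which treat censored patients as event free underestimate risk can be illustrated with a small sketch. The code below is not the study's method: it compares a naive event proportion with a Kaplan-Meier estimate on a made-up ten-patient cohort, and the function names and data are assumptions chosen purely for illustration.

```python
def naive_risk(times, events, horizon):
    """Risk by `horizon` if censored patients are assumed event free."""
    n_events = sum(1 for t, e in zip(times, events) if e and t <= horizon)
    return n_events / len(times)


def kaplan_meier_risk(times, events, horizon):
    """Risk by `horizon` as 1 - S(horizon), using the Kaplan-Meier estimator."""
    survival = 1.0
    # Distinct times at which at least one event occurred, up to the horizon.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        n_events = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


# Hypothetical cohort: follow-up in years; 1 = event observed, 0 = censored.
times = [2, 3, 4, 5, 6, 8, 10, 10, 10, 10]
events = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

print(f"naive risk at 10 years:        {naive_risk(times, events, 10):.2f}")         # 0.30
print(f"Kaplan-Meier risk at 10 years: {kaplan_meier_risk(times, events, 10):.2f}")  # 0.37
```

In this toy cohort three patients are censored before year 10. Counting them as event free keeps them in the denominator for the whole period, which is why the naive estimate comes out lower than the Kaplan-Meier estimate.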



“…An important advantage of ML techniques compared to conventional prognostic algorithms is that ML techniques do not assume linear relationships between variables and outcomes, thus resulting in better performance in identifying individualized outcome predictions [27]. Recent data show that ML algorithms outperform logistic regression models in the prediction of HF outcomes [28–30]. Specifically, the better accuracy of ML algorithms compared to conventional tools has been demonstrated for the prediction of mortality in the setting of acute HF [30], mortality and hospitalization for HFpEF [29], and hospital readmissions [31].…”

Section: Discussion (mentioning)

confidence: 99%

Machine learning versus conventional clinical methods in guiding management of heart failure patients—a systematic review

et al. 2020


Machine learning (ML) algorithms “learn” information directly from data, and their performance improves proportionally with the number of high-quality samples. The aim of our systematic review is to present the state of the art regarding the implementation of ML techniques in the management of heart failure (HF) patients. We manually searched MEDLINE and Cochrane databases as well as the reference lists of the relevant review studies and included studies. Our search retrieved 122 relevant studies. These studies mainly refer to (a) the role of ML in the classification of HF patients into distinct categories which may require a different treatment strategy, (b) discrimination of HF patients from the healthy population or other diseases, (c) prediction of HF outcomes, (d) identification of HF patients from electronic records and identification of HF patients with similar characteristics who may benefit from a similar treatment strategy, (e) supporting the extraction of important data from clinical notes, and (f) prediction of outcomes in HF populations with implantable devices (left ventricular assist device, cardiac resynchronization therapy). We concluded that ML techniques may play an important role for the efficient construction of methodologies for diagnosis, management, and prediction of outcomes in HF patients. Electronic supplementary material: The online version of this article (10.1007/s10741-020-10007-3) contains supplementary material, which is available to authorized users.


“…In this study, we used RHWU cohort as an external validation set to verify the survival nomogram derived from SEER database. External validation is an indispensable step which integrates the nomogram into the different study population [39]. External validation could detect the generalizability of the survival nomogram and ultimately avoid poor goodness-of-fit [40].…”

Section: Discussion (mentioning)

confidence: 99%

Personalizing prognostic prediction in early-onset Colorectal Cancer

Liu et al. 2020. J. Cancer


Accurately estimating prognosis based on clinicopathologic variables could improve risk stratification for patients with early-onset colorectal cancer (EOCRC). Our primary goal was to create and validate a survival nomogram with adequate performance for predicting overall survival (OS) in patients with EOCRC. Least absolute shrinkage and selection operator (LASSO) Cox regression analysis was applied to identify clinical features statistically related to OS. We then established and internally validated a survival nomogram based on the Surveillance, Epidemiology, and End Results (SEER) database (N=23813). A cohort of 77 patients with EOCRC from Renmin Hospital of Wuhan University (RHWU) was used to assess the external validity of the survival nomogram. Moreover, we compared the predictive accuracy of the survival nomogram with that of TNM stage, and also compared OS between the endoscopy and surgery groups before and after propensity score matching (PSM) among EOCRC patients with early-stage disease (Tis-T1N0M0). We selected seven informative indexes (N stage, M stage, perineural invasion, chemotherapy, surgery primary site, summary stage and tumor grade) for the construction of the survival nomogram. The survival nomogram exhibited good discrimination, with a C-index of 0.829, 0.841 and 0.796 in the SEER training, SEER validation and RHWU validation sets, respectively. Calibration curves showed good concordance between the survival nomogram predictions and actual outcomes for 1-year, 3-year and 5-year OS. Furthermore, the survival nomogram was superior to risk stratification by TNM stage in predicting OS among patients with EOCRC. Early-stage patients treated with endoscopy showed similar survival to those treated with surgery, both before and after PSM. We proposed a survival nomogram based on widely used parameters to predict OS in EOCRC patients. This survival nomogram should help oncologists improve risk stratification and prognostication for patients with EOCRC.
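For readers unfamiliar with the C-index quoted in this abstract, the sketch below shows one common way such a statistic is computed for censored survival data. It assumes Harrell's pairwise definition, since the abstract does not say which estimator the authors used, and the function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when the model gave patient i the higher risk score; ties in the score
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical data: follow-up times, event flags (1 = death, 0 = censored),
# and model risk scores where higher means worse predicted prognosis.
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.3, 0.1]

print(concordance_index(times, events, risk_scores))  # 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of roughly 0.80 to 0.84 indicate that the nomogram ordered most comparable patient pairs correctly.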

