Predicting in-hospital mortality in ICU patients with sepsis using gradient boosting decision tree

Li, Ké; Shi, Qianling; Liu, Siru; Xie, Yan; Liu, Jialin

doi:10.1097/md.0000000000025813

Cited by 34 publications

(20 citation statements)

References 34 publications

“…(2020) SIRS + infection + end organ failure Early detection of sepsis ED of a quaternary academic hospital NR NR NR NR van Doorn et al. (2021) Infection + SIRS/SOFA Mortality prediction of sepsis ED at the Maastricht University Medical Center+ NR 1244 NR 100 Li et al. (2021) ICD-9 Mortality prediction of sepsis MIMIC-III V1.4 Remove the patients with data missing more than 30% + Replace by mean value NR NR NR Burdick et al.…”

Section: Results (mentioning)

confidence: 99%

“…(2020) 1 0 1 0 0 1 1 0 van Doorn et al. (2021) 1 1 1 1 1 1 1 1 Li et al. (2021) 1 1 1 1 1 1 1 1 Burdick et al.…”

Section: Results (unclassified)

“…(2020) XGBoost eXtreme Gradient Boosting 0.857 30 days logistic regression logistic regression 0.819 SAPS-II scores Simplified acute physiology score-II 0.797 Kong et al. (2020) LASSO least absolute shrinkage and selection operator 0.829 In hospital RF random forest 0.829 GBM gradient boosting machine 0.845 LR logistic regression 0.833 SAPS II Simplified acute physiology score-II 0.77 Li et al. (2021) GBDT GBDT 0.992 In hospital LR Logistic regression 0.876 KNN k-nearest neighbor 0.877 RF Random forest 0.980 SVM Support vector machine 0.898 Qi et al.…”

Section: Results (mentioning)

confidence: 99%

Evaluating machine learning models for sepsis prediction: A systematic review of methodologies

Deng, Sun et al. 2022

iScience

Summary: Studies on sepsis prediction using machine learning have developed rapidly in recent years. In this review, we propose a set of new evaluation criteria and reporting standards and use them, following PRISMA, to assess the quality of 21 eligible machine learning models. Our assessment shows that (1) the definition of sepsis is not consistent among the studies; (2) data sources, data preprocessing methods, machine learning models, feature engineering, and inclusion types vary widely among the studies; (3) the closer the prediction is to the onset of sepsis, the higher the AUROC; (4) the improvement in AUROC is primarily due to using machine learning as a feature engineering tool; and (5) deep neural networks coupled with Sepsis-3 diagnostic criteria tend to yield better results on time series data collected from patients with sepsis. The new evaluation criteria and reporting standards will facilitate the development of improved machine learning models for clinical applications.

“…The inclusion criteria were as follows: (1) patients were diagnosed with HF according to the International Classification of Diseases, ninth and tenth Revision codes ( Multimedia Appendix 1 ); (2) the diagnosis priority label was “primary” when admitted to the ICU in 24 hours; (3) the ICU stay was more than 1 day; and (4) patients were aged 18 years or older. Patients who had more than 30% missing values were excluded [ 18 ].…”

Section: Methods (mentioning)

confidence: 99%
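The exclusion rule quoted above (drop patients with more than 30% missing values) and the mean replacement that the review table attributes to Li et al. (2021) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `exclude_and_impute`, the dictionary layout, and the feature names are all hypothetical, and the means here are taken over the whole retained cohort, which is an assumption.

```python
def exclude_and_impute(patients, max_missing=0.30):
    """Drop patients whose share of missing features exceeds max_missing,
    then replace each remaining missing value with that feature's mean.

    patients: list of dicts mapping feature name -> float, or None if missing.
    (Hypothetical layout chosen for illustration only.)
    """
    if not patients:
        return []
    features = list(patients[0].keys())

    # Step 1: keep only patients with at most max_missing of features missing.
    kept = [
        p for p in patients
        if sum(p[f] is None for f in features) / len(features) <= max_missing
    ]

    # Step 2: per-feature mean over the observed values of the kept patients.
    means = {}
    for f in features:
        observed = [p[f] for p in kept if p[f] is not None]
        means[f] = sum(observed) / len(observed) if observed else None

    # Step 3: fill the remaining gaps with the feature mean.
    return [
        {f: (p[f] if p[f] is not None else means[f]) for f in features}
        for p in kept
    ]


# Toy records with made-up values; the third has 2 of 4 features missing (50%).
toy = [
    {"hr": 80.0, "bun": 20.0, "lactate": 2.0, "age": 60.0},
    {"hr": 100.0, "bun": None, "lactate": 4.0, "age": 70.0},
    {"hr": None, "bun": None, "lactate": 1.0, "age": 50.0},
]
cleaned = exclude_and_impute(toy)
```

In a train/validation setting, the means would usually be computed on the training portion only so that validation data does not leak into preprocessing; the sketch skips that for brevity.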

Predicting Mortality in Intensive Care Unit Patients With Heart Failure Using an Interpretable Machine Learning Model: Retrospective Cohort Study

Li, Liu, Hu et al. 2022

J Med Internet Res

Self Cite

Background: Heart failure (HF) is a common disease and a major public health problem. HF mortality prediction is critical for developing individualized prevention and treatment plans. However, due to their lack of interpretability, most HF mortality prediction models have not yet reached clinical practice.
Objective: We aimed to develop an interpretable model to predict the mortality risk for patients with HF in intensive care units (ICUs) and used the SHapley Additive exPlanation (SHAP) method to explain the extreme gradient boosting (XGBoost) model and explore prognostic factors for HF.
Methods: In this retrospective cohort study, we developed the model and compared performance on the eICU Collaborative Research Database (eICU-CRD). We extracted data during the first 24 hours of each ICU admission, and the data set was randomly divided, with 70% used for model training and 30% used for model validation. The prediction performance of the XGBoost model was compared with three other machine learning models by the area under the curve. We used the SHAP method to explain the XGBoost model.
Results: A total of 2798 eligible patients with HF were included in the final cohort for this study. The observed in-hospital mortality of patients with HF was 9.97%. Comparatively, the XGBoost model had the highest predictive performance among four models with an area under the curve (AUC) of 0.824 (95% CI 0.7766-0.8708), whereas support vector machine had the poorest generalization ability (AUC=0.701, 95% CI 0.6433-0.7582). The decision curve showed that the net benefit of the XGBoost model surpassed those of other machine learning models at threshold probabilities of 10% to 28%. The SHAP method revealed the top 20 predictors of HF according to the importance ranking, and the average of the blood urea nitrogen was recognized as the most important predictor variable.
Conclusions: The interpretable predictive model helps physicians more accurately predict the mortality risk in ICU patients with HF and can therefore support better treatment plans and optimal resource allocation for their patients. In addition, the interpretable framework can increase the transparency of the model and help physicians judge the reliability of the predictive model.
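The evaluation described in the abstract (a random 70/30 split and an AUC with a 95% CI) can be illustrated with a small dependency-free sketch. The abstract does not say how the confidence interval was obtained, so the percentile bootstrap below is an assumption, and every function name here is hypothetical rather than taken from the study.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle rows and return (train, validation) in a 70/30 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for the AUC (assumed method, see lead-in)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample drew a single class; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy labels and scores with made-up values, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
point = auc(y_true, y_score)
low, high = bootstrap_auc_ci(y_true, y_score, n_boot=200)
train, valid = split_70_30(list(range(10)))
```

With a mortality rate near 10%, as reported above, a stratified split is often preferred so that both partitions contain enough deaths; the plain shuffle here keeps the sketch short.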

“…The variables used to predict the risk of hypoglycemia in patients with type 2 diabetes included various demographic, laboratory, and clinical variables, as well as EHR notes. The extraction of variables was based on experts' opinion and our research [16][17][18][19][20]. These variables were collected during the first 24 hours of admission.…”

Section: Variables Analyzed (mentioning)

confidence: 99%

Predicting Risk of Hypoglycemia in Patients With Type 2 Diabetes by Electronic Health Record–Based Machine Learning: Development and Validation

Yang, Li, Liu et al. 2022

JMIR Med Inform

Self Cite

Background: Hypoglycemia is a common adverse event in the treatment of diabetes. To cope with hypoglycemia efficiently, effective hypoglycemia prediction models need to be developed.
Objective: The aim of this study was to develop and validate machine learning models to predict the risk of hypoglycemia in adult patients with type 2 diabetes.
Methods: We used the electronic health records of all adult patients with type 2 diabetes admitted to West China Hospital between November 2019 and December 2021. The prediction model was developed based on XGBoost and natural language processing. F1 score, area under the receiver operating characteristic curve (AUC), and decision curve analysis (DCA) were used as the main criteria to evaluate model performance.
Results: We included 29,843 patients with type 2 diabetes, of whom 2804 patients (9.4%) developed hypoglycemia. In this study, the embedding machine learning model (XGBoost3) showed the best performance among all the models. The AUC and the accuracy of XGBoost were 0.82 and 0.93, respectively. XGBoost3 was also superior to other models in DCA.
Conclusions: The Paragraph Vector–Distributed Memory model can effectively extract features and improve the performance of the XGBoost model, which can then effectively predict hypoglycemia in patients with type 2 diabetes.
