2022
DOI: 10.1038/s41598-021-04608-7

An explainable machine learning framework for lung cancer hospital length of stay prediction

Abstract: This work introduces a predictive Length of Stay (LOS) framework for lung cancer patients using machine learning (ML) models. The framework is proposed to deal with imbalanced datasets for classification-based approaches using electronic healthcare records (EHR). We have utilized supervised ML methods to predict lung cancer inpatients' LOS during ICU hospitalization using the MIMIC-III dataset. The Random Forest (RF) model outperformed other models and achieved predicted results during the three framework phases. With…

citations
Cited by 91 publications
(40 citation statements)
references
References 37 publications
“…It ended up with an R² score of 0.729. Alsinglawi et al. [14] constructed a LOS prediction framework for lung cancer patients using RF and oversampling techniques (SMOTE and ADASYN). The framework gets an AUC score of 100% on the MIMIC-III dataset.…”
Section: Related Work (mentioning)
confidence: 99%
“…c₁ and c₂ are the mean values of the target feature corresponding to R₁(Aᵢ, S) and R₂(Aᵢ, S), respectively (13). The next step of the algorithm is to find which S can make the MSE of the feature minimum (14) and then use the segmentation point S together with the feature as the node of the tree. After the algorithm divides all features, the CART regression tree uses the average of all leaf nodes as the output (15) [42].…”
Section: Ridge Regression (mentioning)
confidence: 99%
“…Although many studies have used EHR data, most of them have only used quantitative EHR data [8, 9, 10, 11]. In fact, 80% of EHR data comprises semi-structured data such as patients’ physiological conditions (free-text notes and clinician progress notes) at the time of their visits [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…We hypothesize that integrating H&E image data with other data modalities can improve risk stratification since clinical variables, mutation status, and gene expression profiles have individually been shown to be informative [23]. To address this question, we develop and evaluate integrative deep learning models that combine morphological features from H&E WSIs, clinical variables, MSI-status, and mutation status of key genes [24][25][26][27][28][29][30][31].…”
Section: Introduction (mentioning)
confidence: 99%