Machine learning models for predicting acute kidney injury: a systematic review and critical appraisal

Vagliano, Iacopo; Chesnaye, Nick C; Leopold, Jan Hendrik; Jager, Kitty J.; Abu‐Hanna, Ameen; Schut, Martijn C

doi:10.1093/ckj/sfac181

Cited by 24 publications

(21 citation statements)

References 88 publications

“…39 The DDA involves using statistical and computational techniques to obtain meaningful patterns and insights from complex datasets and using these findings to inform the selection of appropriate ML algorithms, features, and hyperparameters. Although the MDA influenced the ML development methodologies of most studies related to the prediction of AKI, 17,18 a combination of both architectures guided the project design and development of this study. The input features were identified based on literature review and consultation with kidney experts in our research team to guide the model developments.…”

Section: Discussion (mentioning)

confidence: 99%

“…12,15,16 The estimation of baseline sCr has been especially important in studies developing and validating machine learning (ML) models to predict AKI. 17,18 In general, these studies have used different methods of estimating the baseline sCr to establish the ground truth in order to label positive AKI occurrences. This can lead to discrepancies in how AKI events are identified and labelled, making direct comparisons of the models' performance metrics challenging.…”

Section: Introduction (mentioning)

confidence: 99%

“…The most recent study published in 2022, which reviewed the application of ML models to predict AKI, has indicated that only 25% of the included studies, reported the precision metric which evaluates the proportion of positive predictions made by the classifier that are correct. 17 Therefore, for the remaining studies, it is not possible to validate the false positive (FP) rates. An appraisal of studies with reported precision revealed a relatively high number of FP predictions.…”

Section: Introduction (mentioning)

confidence: 99%
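
The precision metric described in the statement above (the proportion of positive predictions that are correct) can be illustrated with a minimal sketch. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Proportion of positive predictions that are correct: TP / (TP + FP)."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        # No positive predictions were made, so the ratio is undefined.
        raise ValueError("precision is undefined without positive predictions")
    return true_positives / predicted_positive


# Hypothetical counts, for illustration only.
print(precision(true_positives=56, false_positives=44))  # 0.56
```

Without a reported precision (or the underlying TP and FP counts), the share of false alarms among a model's positive predictions cannot be recovered from AUROC alone, which is the gap the statement points to.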


Machine Learning Decision Support Systems for Predicting Acute Kidney Injury: Improving Precision to improve patient outcomes

Rahimi, Ghadimi, Pole et al. 2023

Preprint


Background There are many machine learning (ML) models which predict acute kidney injury (AKI) for hospitalised patients. While a primary goal of these models is to support clinicians with better decision-making in hospitals, the adoption of different methods of estimating baseline serum creatinine (sCr) can result in establishing inconsistent ground truth when estimating AKI incidence. The real-world utility of such models is therefore often an issue given the high rate of false positive predictions which can result in negative clinical outcomes. Objective The first aim of this study was to develop and assess the performance of ML models using three different methods of estimating baseline sCr. The second aim was to conduct an error analysis to reduce the rate of false positives. Materials and Methods For both aims, the Intensive Care Unit (ICU) patients of the Medical Information Mart for Intensive Care (MIMIC)-IV dataset with the KDIGO (Kidney Disease Improving Global Outcome) definition was used to identify AKI episodes using three different methods of estimating baseline sCr. ML models were developed for each cohort and the performance of the models was compared. Explainability methods were used to analyse the XGBoost errors. Results The baseline, defined as the mean of sCr in 180 to 7 days prior to ICU, yielded the highest performance metrics with the XGBoost model. Using the explainability methods, the mean of sCr in 180 to 0 days pre-ICU led to a further reduction in FP rate, with the highest AUC of 0.86, recall of 0.61, precision of 0.56 and f1 score of 0.58. The cohort size was 31,586 admissions, of which 5,473 (17.32%) had AKI. Conclusion To enable the effective use of AI in AKI prediction and management, a clinically relevant and widely applicable standard method for baseline sCr is needed. In healthcare, the utilisation of explainability techniques can aid AI developers and end users in comprehending how AI models are making predictions. 
We concluded that ML development with model-driven and data-driven architectures can be effective in minimizing the occurrence of false positives. This can augment the success rate of ML implementation in routine care.
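
The abstract defines baseline sCr as the mean of measurements in a window before ICU admission (180 to 7 days, or 180 to 0 days). A minimal sketch of that windowed mean follows. The function name, the `(timestamp, value)` input layout, the inclusive window boundaries, and the example values are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional, Sequence, Tuple


def baseline_scr(
    measurements: Sequence[Tuple[datetime, float]],
    icu_admission: datetime,
    earliest_days: int = 180,
    latest_days: int = 7,
) -> Optional[float]:
    """Mean sCr over [admission - earliest_days, admission - latest_days].

    Boundaries are treated as inclusive (an assumption). Returns None when
    no measurement falls inside the window.
    """
    window_start = icu_admission - timedelta(days=earliest_days)
    window_end = icu_admission - timedelta(days=latest_days)
    in_window = [v for t, v in measurements if window_start <= t <= window_end]
    return mean(in_window) if in_window else None


# Hypothetical measurements (mg/dL), for illustration only.
admission = datetime(2023, 6, 30)
history = [
    (datetime(2023, 1, 15), 0.9),  # 166 days before admission
    (datetime(2023, 6, 1), 1.1),   # 29 days before admission
    (datetime(2023, 6, 28), 1.6),  # 2 days before admission
]

print(baseline_scr(history, admission))                 # 180 to 7 day window
print(baseline_scr(history, admission, latest_days=0))  # 180 to 0 day window
```

In this made-up example the two windows give different baselines because the 180 to 0 day window also includes the measurement taken 2 days before admission. This is the kind of discrepancy in ground-truth labelling that the abstract describes.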


“…The infrequent external validation of ML models for the prediction of acute events in the ICU was already noted in a 2019 systematic review, with only 7% of studies at the time using geographically independent data for model validation [8]. This has been echoed in more recent, disease-specific reviews looking at models for sepsis [20] and acute kidney injury [42]. While we showed that this percentage has somewhat improved since, we also find that challenges remain even if external validation is performed.…”

Section: Discussion (mentioning)

confidence: 99%

Generalisability of AI-based scoring systems in the ICU: a systematic review and meta-analysis

Rockenschaub, Akay, Carlisle et al. 2023

Preprint


Background: Machine learning (ML) is increasingly used to predict clinical deterioration in intensive care unit (ICU) patients through scoring systems. Although promising, such algorithms often overfit their training cohort and perform worse at new hospitals. Thus, external validation is a critical but frequently overlooked step to establish the reliability of predicted risk scores to translate them into clinical practice. We systematically reviewed how regularly external validation of ML-based risk scores is performed and how their performance changed in external data. Methods: We searched MEDLINE, Web of Science, and arXiv for studies using ML to predict deterioration of ICU patients from routine data. We included primary research published in English before April 2022. We summarised how many studies were externally validated, assessing differences over time, by outcome, and by data source. For validated studies, we evaluated the change in area under the receiver operating characteristic (AUROC) attributable to external validation using linear mixed-effects models. Results: We included 355 studies, of which 39 (11.0%) were externally validated, increasing to 17.9% by 2022. Validated studies made disproportionate use of open-source data, with two well-known US datasets (MIMIC and eICU) accounting for 79.5% of studies. On average, AUROC was reduced by -0.037 (95% CI -0.064 to -0.017) in external data, with >0.05 reduction in 38.6% of studies. Discussion: External validation, although increasing, remains uncommon. Performance was generally lower in external data, questioning the reliability of some recently proposed ML-based scores. Interpretation of the results was challenged by an overreliance on the same few datasets, implicit differences in case mix, and exclusive use of AUROC.


“…Both AKI and sepsis are also highly heterogeneous [21]. This makes models built with conventional FL strategies such as federated averaging challenging to generalize across clinics, limiting their use [7,22,23]. Several federated architectures have been proposed to mitigate effects of data heterogeneity in other domains and built personalized, but globally correlated, models to mitigate drift across sites [23], such as model-agnostic meta-learning (MAML), federated multitask learning, and knowledge distillation [24][25][26][27][28].…”

Section: Introduction (mentioning)

confidence: 99%

Data heterogeneity in federated learning with Electronic Health Records: Case studies of risk prediction for acute kidney injury and sepsis diseases in critical care

Rajendran, Pan et al. 2023

PLOS Digit Health


With the wider availability of healthcare data such as Electronic Health Records (EHR), more and more data-driven based approaches have been proposed to improve the quality-of-care delivery. Predictive modeling, which aims at building computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, concerns about privacy of healthcare data may hinder the development of effective predictive models that are generalizable because this often requires rich diverse data from multiple clinical institutions. Recently, federated learning (FL) has demonstrated promise in addressing this concern. However, data heterogeneity from different local participating sites may affect prediction performance of federated models. Due to acute kidney injury (AKI) and sepsis’ high prevalence among patients admitted to intensive care units (ICU), the early prediction of these conditions based on AI is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in ICU as two examples to explore the impact of data heterogeneity in the FL framework as well as compare performances across frameworks. We built predictive models based on local, pooled, and FL frameworks using EHR data across multiple hospitals. The local framework only used data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites’ data. A model was updated locally, and its parameters were shared to a central aggregator, which was used to update the federated model’s parameters and then subsequently, shared with each site. We found models built within a FL framework outperformed local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of the heterogeneity within the EHR data. 
The different distributions of demographic profiles, medication use, and site information contributed to data heterogeneity.
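
The aggregation step described in the abstract (local parameters sent to a central aggregator, which updates the federated model and shares it back) can be sketched as a sample-size-weighted average, the convention used in federated averaging. The abstract does not state the aggregation rule, so the weighting, the function name, and the example values below are assumptions for illustration, not the authors' method.

```python
from typing import List, Sequence


def federated_average(
    site_params: Sequence[Sequence[float]],
    site_sizes: Sequence[int],
) -> List[float]:
    """Average per-site parameter vectors, weighted by local sample counts."""
    if not site_params or len(site_params) != len(site_sizes):
        raise ValueError("need one sample count per site and at least one site")
    n_params = len(site_params[0])
    if any(len(p) != n_params for p in site_params):
        raise ValueError("all sites must share the same parameter layout")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    # Each coordinate is the weighted mean of that coordinate across sites.
    return [
        sum(p[i] * n for p, n in zip(site_params, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical two-site round: raw patient data never leaves a site,
# only parameter vectors and sample counts reach the aggregator.
local_updates = [[1.0, 2.0], [3.0, 4.0]]
sample_counts = [1, 3]
print(federated_average(local_updates, sample_counts))  # [2.5, 3.5]
```

Because larger sites dominate the weighted mean, differences in case mix between sites, such as those the abstract attributes to demographic profiles and medication use, can pull the shared model away from what suits a smaller site.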
