Distilling Knowledge from Publicly Available Online EMR Data to Emerging Epidemic for Prognosis

Ma, Liwei; Ma, Xinyu; Gao, Jianliang; Jiao, Xianfeng; Yu, Zhihao; Zhang, Chaohe; Ruan, Wenjie; Wang, Yasha; Tang, Wen; Wang, Jiangtao

doi:10.1145/3442381.3449855

Cited by 21 publications

(15 citation statements)

References 36 publications

“…For general COVID-19 related clinical tasks such as diagnosis (Feng et al, 2020; Zoabi et al, 2021), length of stay (Dan et al, 2020; Ma et al, 2021), or severity risk (Jamshidi et al, 2022; Wynants et al, 2020; Yan et al, 2020) The integration of domain knowledge into machine learning models is another ongoing research topic. Most existing works extract domain knowledge from medical literature or clinical concept hierarchies such as ICD codes.…”

Section: Methods Details Background (mentioning)

confidence: 99%

“…Finally, we evaluate the models on the testing set and report the prediction performance. The evaluation metrics are AUROC (area under the receiver operating characteristic curve), AUPRC (area under the precision-recall curve), and the maximum of the minimum between precision and sensitivity under the same threshold, known as Min(Re,Pr), following existing works (Ma et al, 2020, 2021). An example of Min(Re,Pr) is shown in Figure 3, which evaluates whether the model can achieve the balance between precision and recall.…”

Section: Cohort Construction and Evaluation Metrics (mentioning)

confidence: 99%
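
The Min(Re,Pr) metric described in the statement above can be sketched in a few lines. This is a minimal illustration, not the citing authors' implementation: the function name `min_re_pr` and the convention that a sample is predicted positive when its score is greater than or equal to the threshold are assumptions made here. For each candidate threshold it computes precision and recall (sensitivity), takes the smaller of the two, and returns the largest such value over all thresholds.

```python
def min_re_pr(y_true, y_score):
    """Max over thresholds of min(precision, recall).

    y_true  : list of 0/1 labels
    y_score : list of predicted scores, same length as y_true
    Assumption: a sample is predicted positive when score >= threshold.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("Min(Re,Pr) is undefined without positive labels")

    best = 0.0
    # Every distinct score is a candidate threshold, so at least one
    # sample is always predicted positive and precision is well defined.
    for thr in sorted(set(y_score)):
        pred_pos = sum(1 for s in y_score if s >= thr)
        true_pos = sum(1 for y, s in zip(y_true, y_score) if s >= thr and y == 1)
        precision = true_pos / pred_pos
        recall = true_pos / total_pos
        best = max(best, min(precision, recall))
    return best


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(min_re_pr(labels, scores))
```

A high value means some single threshold gives both high precision and high recall, which is why the metric is read as a measure of balance between the two.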

“…Most are designed with limited features and are more likely to perform poorly on large heterogeneous datasets. For example, Ma et al (2021) conducted length-of-stay prediction for COVID-19 patients from the HM Hospitals in Spain using a total of 66 lab test features for each patient, and Yan et al (2020) conducted mortality prediction for COVID-19 patients from the Tongji Hospital in China using 74 lab test features. Most of these methods also have limited explainability due to their use of "black box" deep neural networks, limiting their clinical utility.…”

mentioning

confidence: 99%

MedML: Fusing medical knowledge and machine learning models for early pediatric COVID-19 hospitalization and severity prediction

Gao, Yang, Heintz et al. 2022. iScience

Self Cite

“…In the past two years, many machine learning and deep learning models have been proposed to conduct COVID-19 clinical prediction tasks, including diagnosis prediction [13,14], length-of-stay prediction [15,16], severity and mortality prediction [3-12], etc. Yan et al [3] conducted mortality prediction for COVID-19 patients from the Tongji Hospital in China.…”

Section: Covid-19 Predictive Modeling Using Ehr Data (mentioning)

confidence: 99%

“…Gao et al [12] used deep learning and tree-based models to predict COVID-19 severity and hospitalization risks. Ma et al [15] conducted length-of-stay prediction for hospitalized COVID-19 patients from the HM Hospitals in Spain. Though these works have achieved good prediction performance on their own data, these different models are applied to different datasets, and most of them are not publicly available or have strict access restrictions.…”

Section: Covid-19 Predictive Modeling Using Ehr Data (mentioning)

confidence: 99%

A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care: Choosing the Best Model for COVID-19 Prognosis

Gao, Zhu, Wang et al. 2022. Preprint

Self Cite

Objective: The COVID-19 pandemic has placed a heavy burden on healthcare systems worldwide and caused huge social disruption and economic loss. Many deep learning models have been proposed to conduct clinical predictive tasks such as mortality prediction for COVID-19 patients in intensive care units using Electronic Health Record (EHR) data. Despite their initial success in certain clinical applications, there is currently a lack of benchmarking results that would allow a fair comparison and the selection of the optimal model for clinical use. Furthermore, there is a discrepancy between the formulation of traditional prediction tasks and real-world clinical practice in intensive care.

Methods: To fill these gaps, we propose two clinical prediction tasks, Outcome-specific length-of-stay prediction and Early mortality prediction for COVID-19 patients in intensive care units. The two tasks are adapted from the naive length-of-stay and mortality prediction tasks to accommodate the clinical practice for COVID-19 patients. We propose fair, detailed, open-source data-preprocessing pipelines and evaluate 17 state-of-the-art predictive models on two tasks, including 5 machine learning models, 6 basic deep learning models and 6 deep learning predictive models specifically designed for EHR data.

Results: We provide benchmarking results using data from two real-world COVID-19 EHR datasets. One dataset is publicly available without needing any inquiry and the other can be accessed on request. We provide fair, reproducible benchmarking results for two tasks.

Conclusions: We deploy all experiment results and models on an online platform. We also allow clinicians and researchers to upload their data to the platform and get quick prediction results using our trained models. We hope our efforts can further facilitate deep learning and machine learning research for COVID-19 predictive modeling.
Software Repository: https://github.com/yhzhu99/covid-ehr-benchmarks

Introduction

The COVID-19 pandemic needs no introduction. As of May 2022, the virus has caused over 500 million infections and over 6 million deaths [1]. Though research shows that new variants of COVID-19 are less deadly, they are more transmissible, and the number of cases is still surging globally [2]. Under current circumstances, early risk prediction and estimation of disease progression, especially for COVID-19 patients in intensive care units, have become important for allocating limited medical resources and relieving the burden on our healthcare system.

Electronic health record (EHR) data and intelligent models offer a viable way to address this challenge. Many machine learning and deep learning models have been proposed to utilize COVID-19 patients' EHR data to conduct clinical prediction tasks including severity [3-12], diagnosis [13, 14], length-of-stay (LOS) [15, 16], etc. Earlier general-purpose EHR predictive models can also be applied to COVID-19 prediction tasks. These works achieve better prediction performances compared with ...

Contrastive Learning-Based Imputation-Prediction Networks for In-hospital Mortality Risk Modeling Using EHRs

Liu, Zhang, Qin et al. 2023. Lecture Notes in Computer Science
