An interpretable outcome prediction model based on electronic health records and hierarchical attention

Du, Juan; Zeng, Dajian; Zhao, Li; Liu, Jingxuan; Lv, Mingqi; Chen, Ling; Zhang, Dan; Ji, Shouling

doi:10.1002/int.22697

Cited by 10 publications

(5 citation statements)

References 33 publications


“…We investigated the risk factors identified by the model by analyzing which predictors most attribute to the model’s prediction. We used the attention scores [32, 33] obtained from the LSTM cells, as a way to determine which features are given more attention (importance) by the model to predict the output (more details are provided in Supplementary D.1).…”

Section: Experiments and Results (mentioning)

confidence: 99%

Reliable prediction of childhood obesity using only routinely collected EHRs is possible

Gupta, Phan, Eckrich et al. 2024. Preprint.


Objective: Identifying children at high risk of developing obesity can offer a critical time to change the course of the disease before it establishes. Numerous studies have tried to achieve this; but practical limitations remain, including (i) relying on data not present in routinely available pediatric data (like prenatal data), (ii) focusing on a single age prediction (hence, not tested across ages), and (iii) not achieving good results or adequately validating those.

Methods: A customized sequential deep learning model was built to predict the risk of childhood obesity, focusing especially on capturing the temporal patterns. The model was trained only on routinely collected EHRs, containing a list of features identified by a group of clinical experts, and sourced from 36,191 diverse children aged 0 to 10. The model was evaluated using extensive discrimination, calibration, and utility analysis; and was validated temporally, geographically, and across various subgroups.

Results: Our results are mostly better (and never worse) than all previous studies, including those that focus on single-age predictions or link EHRs to external data. Specifically, the model consistently achieved an area under the curve (AUROC) of above 0.8 (with most cases around 0.9) for predicting obesity within the next 3 years for children 2 to 7. The validation results show the robustness of the model. Furthermore, the most influential predictors of the model match important risk factors of obesity.

Conclusions: Our model is able to predict the risk of obesity for young children using only routinely collected EHR data, greatly facilitating its integration with the periodicity schedule. The model can serve as an objective screening tool to inform prevention efforts, especially by helping with very delicate interactions between providers and families in primary care settings.


“…Self‐attention has been shown to perform better than sequence‐based models such as bidirectional gated recurrent unit (GRU) with attention (Bi‐GRU‐ATT) when dealing with a sparse dataset (Wang et al, 2021). Hierarchical attention, an improvement over the attention mechanism, has been applied in document classification (Yang et al, 2016), detecting financial fraud (Craja et al, 2020), health outcome prediction (Du et al, 2022), and sentiment analysis (Wang et al, 2021). Yang et al (2016) applied attention at two semantic hierarchies, namely, word and sentence level, but did not consider self‐attention and a mechanism to capture long‐term dependencies, unlike this work.…”

Section: Literature Review and Research Gap (mentioning)

confidence: 99%

A deep learning model for online doctor rating prediction

Kulshrestha, Krishnaswamy, Sharma. 2023. Journal of Forecasting.


Predicting doctor ratings is a critical task in the healthcare industry. A patient usually provides ratings to a few doctors only, leading to the data sparsity issue, which complicates the rating prediction task. The study attempts to improve the prediction methodologies used in the doctor rating prediction systems. The study proposes a novel deep learning (DL) model for online doctor rating prediction based on a hierarchical attention bidirectional long short-term memory (ODRP-HABiLSTM) network. A hierarchical self-attention bidirectional long short-term memory (HA-BiLSTM) network incorporates a textual review's word and sentence level information. A highway network is used to refine the representations learned by BiLSTM. The resulting latent patient and doctor representations are utilized to predict the online doctor ratings. Experimental findings based on real-world doctor reviews from Yelp.com across two medical specialties demonstrate the proposed model's superior performance over state-of-the-art benchmark models. In addition, robustness analysis is used to strengthen the findings.
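The word- and sentence-level attention described in the citation statement and abstract above can be illustrated with a minimal sketch. This is not the ODRP-HABiLSTM model or the method of Yang et al. (2016): it omits the BiLSTM encoder, the highway network, and the learned projection that usually precedes scoring, and it scores each vector by a plain dot product with a context vector. The names `attention_pool`, `hierarchical_attention`, `word_context`, and `sentence_context` are hypothetical, and the toy vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(vectors, context):
    """Score each vector against `context`, then return the attention
    weights and the weighted sum of the vectors."""
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled


def hierarchical_attention(document, word_context, sentence_context):
    """Two-level attention: words are pooled into sentence vectors,
    then sentence vectors are pooled into one document vector."""
    word_weights, sentence_vectors = [], []
    for sentence in document:
        weights, sentence_vector = attention_pool(sentence, word_context)
        word_weights.append(weights)
        sentence_vectors.append(sentence_vector)
    sentence_weights, doc_vector = attention_pool(sentence_vectors, sentence_context)
    return doc_vector, word_weights, sentence_weights


# Toy document: two sentences of 2-dimensional word vectors (assumed values).
document = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]],
]
word_context = [1.0, 0.0]      # assumed context vector at the word level
sentence_context = [0.0, 1.0]  # assumed context vector at the sentence level

doc_vector, word_weights, sentence_weights = hierarchical_attention(
    document, word_context, sentence_context
)
```

The returned `word_weights` and `sentence_weights` are the quantities that attention-based interpretability analyses typically inspect, since each set sums to one and indicates how much each word or sentence contributed to the pooled representation.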


“…With the worldwide adoption of electronic health record (EHR) systems, machine learning has made great strides in the secondary use of EHR data toward more accurate clinical risk prediction. [1][2][3][4][5] These studies have a fundamental assumption that the data distributions of the training and test sets are the same, and thus prediction model development is typically a one-time activity. However, clinical practices such as patient care and hospital conditions can change over time, and disease prevalence and cause can also change over time [6]; both cases would lead to changes in the data distributions, resulting in model performance drift.…”

Section: Introduction (mentioning)

confidence: 99%

“…Compared with full retraining, adapting the current model to a new environment requires a more complex learning paradigm, such as online learning [11] and lifelong learning, [12] both of which are prominent for dynamic environments or concept drift scenarios, while lifelong learning is superior to online learning in preserving old knowledge against being overwritten by new knowledge. [12] However, they are not suitable for EHR-based clinical prediction modeling because (1) they are designed to tackle streaming data (such as stock price, sensor, and other time-series data), that is, learn from a sequence of data instances one by one at each time and continually adapt the current model to the new environment, and (2) most EHR-based modeling studies are based on cross-sectional and longitudinal data of irregular time points.…”

Section: Introduction (mentioning)

confidence: 99%


A hybrid adaptive approach for instance transfer learning with dynamic and imbalanced data

Zhang, Liu, Yuan et al. 2022. Int J of Intelligent Sys.


Machine learning has demonstrated success in clinical risk prediction modeling with complex electronic health record (EHR) data. However, the evolving nature of clinical practices can dynamically change the underlying data distribution over time, leading to model performance drift. Adopting an outdated model is potentially risky and may result in unintentional losses. In this paper, we propose a novel Hybrid Adaptive Boosting approach (HA‐Boost) for transfer learning. HA‐Boost is characterized by the domain similarity‐based and class imbalance‐based adaptation mechanisms, which simultaneously address two critical limitations of the classical TrAdaBoost algorithm. We validated HA‐Boost in predicting hospital‐acquired acute kidney injury using real‐world longitudinal EHRs data. The experiment results demonstrate that HA‐Boost stably outperforms the competing baselines in terms of both Area Under Receiver Operating Characteristic and Area Under Precision‐Recall Curve across a 7‐year time span. This study has confirmed the effectiveness of transfer learning as a superior model updating approach in a dynamic environment.
