Background:
Atrial fibrillation (AF) is associated with substantial morbidity, especially when it goes undetected. If new-onset AF could be predicted, targeted screening could be used to find it early. We hypothesized that a deep neural network could predict new-onset AF from the resting 12-lead ECG and that this prediction may help identify those at risk of AF-related stroke.
Methods:
We used 1.6 M resting 12-lead digital ECG traces from 430 000 patients collected from 1984 to 2019. Deep neural networks were trained to predict new-onset AF (within 1 year) in patients without a history of AF. Performance was evaluated using areas under the receiver operating characteristic curve and precision-recall curve. We performed an incidence-free survival analysis for a period of 30 years following the ECG stratified by model predictions. To simulate real-world deployment, we trained a separate model using all ECGs before 2010 and evaluated model performance on a test set of ECGs from 2010 through 2014 that were linked to our stroke registry. We identified the patients at risk for AF-related stroke among those predicted to be high risk for AF by the model at different prediction thresholds.
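The two evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names and the toy labels and scores are assumptions made for illustration, the pairwise AUROC loop is quadratic and only suitable for small inputs, and the precision-recall area is computed as step-wise average precision, which assumes distinct scores.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.

    Averages the precision observed at the rank of each positive example,
    scanning from the highest score to the lowest.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("Average precision:", average_precision(toy_labels, toy_scores))
```

With a rare outcome such as new-onset AF within 1 year, the precision-recall area is bounded below by the outcome prevalence rather than by 0.5, which is why it is typically reported alongside the AUROC.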
Results:
The area under the receiver operating characteristic curve and area under the precision-recall curve were 0.85 and 0.22, respectively, for predicting new-onset AF within 1 year of an ECG. The hazard ratio for the predicted high- versus low-risk groups over a 30-year span was 7.2 (95% CI, 6.9–7.6). In a simulated deployment scenario, the model predicted new-onset AF at 1 year with a sensitivity of 69% and specificity of 81%. The number needed to screen to find 1 new case of AF was 9. Of all patients who experienced an AF-related stroke within 3 years of the index ECG, 62% had been classified by the model as high risk for new-onset AF.
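How sensitivity, specificity, and a number needed to screen relate can be sketched as follows. Two assumptions are made here that the abstract does not state: that the number needed to screen is the reciprocal of the positive predictive value among patients flagged as high risk, and that the prevalence values in the loop are plausible. Those prevalence values are hypothetical placeholders, not study data.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions; PPV is undefined")
    return true_pos / (true_pos + false_pos)


def number_needed_to_screen(ppv_value: float) -> float:
    """Flagged patients screened per true case found (assumed to be 1 / PPV)."""
    if ppv_value <= 0.0:
        raise ValueError("PPV must be positive")
    return 1.0 / ppv_value


if __name__ == "__main__":
    SENSITIVITY = 0.69  # reported in the abstract
    SPECIFICITY = 0.81  # reported in the abstract
    # Hypothetical 1-year prevalences of new-onset AF, for illustration only.
    for prevalence in (0.02, 0.03, 0.05):
        value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={value:.3f}  "
              f"NNS={number_needed_to_screen(value):.1f}")
```

The sketch shows why a fixed sensitivity and specificity do not determine the number needed to screen on their own: the result depends strongly on how common the outcome is in the screened population.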
Conclusions:
Deep learning can predict new-onset AF from the 12-lead ECG in patients with no previous history of AF. This prediction may help identify patients at risk for AF-related strokes.
Background:
Timely diagnosis of structural heart disease improves patient outcomes, yet many cases remain undiagnosed. While population screening with echocardiography is impractical, electrocardiogram (ECG)-based prediction models can help target high-risk patients. We developed a novel ECG-based machine learning approach to predict multiple structural heart conditions, hypothesizing that a composite model would yield higher prevalence and positive predictive values (PPVs) to facilitate meaningful recommendations for echocardiography.
Methods:
Using 2,232,130 ECGs linked to electronic health records and echocardiography reports from 484,765 adults between 1984 and 2021, we trained machine learning models to predict the presence or absence of any of seven echocardiography-confirmed diseases within one year. This composite label included: moderate or severe valvular disease (aortic/mitral stenosis or regurgitation, tricuspid regurgitation), reduced ejection fraction <50%, or interventricular septal thickness >15 mm. We tested various combinations of input features (demographics, labs, structured ECG data, ECG traces) and evaluated model performance using 5-fold cross-validation, multi-site validation trained on one site and tested on 10 independent sites, and simulated retrospective deployment trained on pre-2010 data and deployed in 2010.
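The composite label described above can be sketched as a simple rule over echocardiography findings. The thresholds come from the text; the `Severity` scale, the `EchoFindings` record, its field names, and the choice to treat a missing measurement as negative are all hypothetical and may differ from how the authors encoded their data.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """Hypothetical ordinal grading of a valvular lesion."""
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass
class EchoFindings:
    """Hypothetical per-study record; field names are illustrative."""
    aortic_stenosis: Severity = Severity.NONE
    aortic_regurgitation: Severity = Severity.NONE
    mitral_stenosis: Severity = Severity.NONE
    mitral_regurgitation: Severity = Severity.NONE
    tricuspid_regurgitation: Severity = Severity.NONE
    ejection_fraction_pct: Optional[float] = None
    septal_thickness_mm: Optional[float] = None


def composite_label(echo: EchoFindings) -> bool:
    """True if any of the seven component diseases is present."""
    valves = (
        echo.aortic_stenosis,
        echo.aortic_regurgitation,
        echo.mitral_stenosis,
        echo.mitral_regurgitation,
        echo.tricuspid_regurgitation,
    )
    # Moderate or severe disease of any listed valve.
    if any(v >= Severity.MODERATE for v in valves):
        return True
    # Reduced ejection fraction: below 50%.
    # Assumption: a missing measurement counts as negative.
    if echo.ejection_fraction_pct is not None and echo.ejection_fraction_pct < 50:
        return True
    # Interventricular septal thickness above 15 mm.
    if echo.septal_thickness_mm is not None and echo.septal_thickness_mm > 15:
        return True
    return False


if __name__ == "__main__":
    print(composite_label(EchoFindings(ejection_fraction_pct=45.0)))
    print(composite_label(EchoFindings(mitral_regurgitation=Severity.MILD)))
```

Because the label is a logical OR over its components, its prevalence is at least as high as that of the most common component, which is the mechanism behind the higher PPV the authors hypothesize for a composite model.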
Results:
Our composite 'rECHOmmend' model using age, sex and ECG traces had an area under the receiver operating characteristic curve (AUROC) of 0.91 and PPV of 42% at 90% sensitivity, with a composite label prevalence of 17.9%. Individual disease models had AUROCs ranging from 0.86 to 0.93 and lower PPVs ranging from 1% to 31%. AUROCs for models using different input features ranged from 0.80 to 0.93, increasing with additional features. Multi-site validation showed similar results to cross-validation, with an aggregate AUROC of 0.91 across our independent test set of 10 clinical sites after training on a separate site. Our simulated retrospective deployment showed that for ECGs acquired in patients without pre-existing structural heart disease in the year 2010, 11% were classified as high-risk, of which 41% (4.5% of total patients) developed true echocardiography-confirmed disease within one year.
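The arithmetic linking these figures can be checked with a short sketch. The first function reproduces the reported 4.5% as the product of the flagged fraction and the PPV among flagged patients. The second back-calculates the specificity implied by a given sensitivity, PPV, and prevalence; the abstract does not report this value, and the calculation assumes all three figures describe the same population. Both function names are illustrative.

```python
def confirmed_fraction(flagged_fraction: float, ppv: float) -> float:
    """Share of all patients who are both flagged and truly diseased."""
    return flagged_fraction * ppv


def implied_specificity(sensitivity: float, ppv: float, prevalence: float) -> float:
    """Specificity consistent with the given sensitivity, PPV, and prevalence.

    Solves PPV = TP / (TP + FP) for FP, where TP = sensitivity * prevalence
    and FP = (1 - specificity) * (1 - prevalence).
    """
    if not 0.0 < ppv <= 1.0:
        raise ValueError("PPV must lie in (0, 1]")
    if not 0.0 <= prevalence < 1.0:
        raise ValueError("prevalence must lie in [0, 1)")
    true_pos = sensitivity * prevalence
    false_pos = true_pos * (1.0 - ppv) / ppv
    return 1.0 - false_pos / (1.0 - prevalence)


if __name__ == "__main__":
    # 11% flagged as high-risk, 41% of those confirmed (both from the text).
    print(f"confirmed share of all patients: {confirmed_fraction(0.11, 0.41):.3f}")
    # 90% sensitivity, 42% PPV, 17.9% prevalence (all from the text).
    spec = implied_specificity(0.90, 0.42, 0.179)
    print(f"implied specificity (derived, not reported): {spec:.3f}")
```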
Conclusions:
An ECG-based machine learning model using a composite endpoint can identify a population at high risk of undiagnosed, clinically significant structural heart disease while outperforming single-disease models and improving practical utility through higher PPVs. This approach can facilitate targeted screening with echocardiography to reduce underdiagnosis of structural heart disease.