Importance: Despite sex and race disparities in the symptom presentation, diagnosis, and management of acute coronary syndrome (ACS), these differences have not been investigated in the development and validation of machine learning (ML) models using individualized patient information from electronic health records (EHRs) to diagnose ACS. Objective: To evaluate ML-based ACS diagnosis performance across different subpopulations in a multi-site emergency department (ED) setting and determine how bias-mitigating techniques influence ML performance. Design, Setting, and Participants: This retrospective observational study included data from 2,334,316 ED patients (>18 years) from January 2007 to June 2020. Exposure: Logistic regression (LR) and neural network (NN) models were assessed in ED encounters grouped by sex, race, presence or absence of chest pain, EHR data quality, and timeliness of several key ED procedures. Prejudice regularization, reweighting, and within-subpopulation training were evaluated for bias mitigation. Main Outcomes/Measures: Metrics including area under the receiver operating characteristic curve (AUROC) were used to assess performance. Results: We analyzed 4,268,165 ED visits in which patient demographics by race were 67.40% White, 19.20% Black, 2.40% Asian, and 11.00% Other or Unknown. Patient composition was 54.80% female and 45.20% male. Both models' AUROCs were significantly higher in White vs. Black patients (z-score = 3.23 for LR and 4.26 for NN; P < 0.0006), in males vs. females (z-score = 3.81 for LR and 4.16 for NN; P < 0.0001), and in patients without chest pain vs. patients with chest pain (z-score = 13.32 for LR and 17.70 for NN; P < 0.0001). Prejudice regularization and reweighting techniques did not reduce biases. Training in race-specific and sex-specific training populations also did not yield statistically significant improvements in ML algorithm performance. Chest pain-specific training led to significantly improved AUROC. Conclusion: EHR-derived ML models trained and tested within similar demographic subpopulations and symptom groups may perform better than ML models that are trained in random populations, and provide less biased clinical decision support for ACS diagnosis.
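The subgroup comparisons above are reported as z-scores on AUROC differences. The abstract does not say which test produced them, so the following is only a minimal sketch of one common approach for independent groups: the Hanley-McNeil standard error of each AUROC, combined into a z-score. The function names and the toy scores are hypothetical and are not taken from the study.

```python
import math


def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, so this
    form is only meant for small illustrative inputs.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUROC estimate (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def auroc_z_score(group_a, group_b):
    """z-score for the AUROC difference between two independent groups.

    Each group is a (scores_pos, scores_neg) pair of model scores for
    encounters with and without the outcome. Assumption: the groups share
    no patients, so the two variances simply add.
    """
    auc_a = auroc(*group_a)
    auc_b = auroc(*group_b)
    se_a = hanley_mcneil_se(auc_a, len(group_a[0]), len(group_a[1]))
    se_b = hanley_mcneil_se(auc_b, len(group_b[0]), len(group_b[1]))
    return (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Hypothetical model scores for two subpopulations (not study data).
group_a = ([0.9, 0.8, 0.7, 0.4], [0.1, 0.3, 0.5, 0.2])
group_b = ([0.6, 0.5, 0.3, 0.7], [0.4, 0.2, 0.6, 0.1])
z = auroc_z_score(group_a, group_b)
```

A positive `z` means the model discriminates better in the first group. With samples as small as the toy data the normal approximation is not reliable; the sketch only shows the shape of the calculation.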
Background: Electronic health records (EHRs) contain individualized patient data that can be used to develop diagnostic and risk prediction models with artificial intelligence (AI) algorithms. Explicit and implicit sources of bias embedded in EHRs may hinder generalizable model performance and perpetuate bias. Objective: This study explores how existing sex and racial disparities in the clinical assessment and management of acute coronary syndrome (ACS) are reflected in patients' EHRs. We outlined recommendations on how these sources of bias should be examined and mitigated within the framework of development and validation of AI-based diagnosis and risk stratification algorithms. Methods: This retrospective study examined several previously unrecognized EHR-embedded biases in a multisite emergency department (ED) setting. We assessed sex and race differences in EHR data missingness and timeliness of several key ED procedures following ACS suspicion. We additionally conducted a data-driven clustering analysis to detect latent groups of ACS patients. Results: A total of 10,043 ACS-associated ED visits were included. We identified sex and race differences in the prevalence of ACS symptoms, data missingness (vitals and labs), and waiting time for troponin order and treatment initiation within 24 hours post-admission. Our cluster analysis discovered four groups of ACS patients corresponding to common symptoms. Additionally, we identified differences in clinical management and patient demographics across these clusters. Conclusions: We discovered several sources of bias inherent in the clinical practice of ACS diagnosis and treatment that are also present in EHR data. Our study supports the inclusion of a wider range of symptoms into AI-based diagnosis and risk stratification tools. These models should also be validated in ED admissions without chest pain and in patients who experience a longer waiting time for ED procedures.