Purpose: The aim of this study was to assess racial/ethnic and socioeconomic disparities in the difference between atherosclerotic vascular disease prevalence measured by a multitask convolutional neural network (CNN) deep learning model using frontal chest radiographs (CXRs) and the prevalence reflected by administrative hierarchical condition category (HCC) codes in two cohorts of patients with coronavirus disease 2019 (COVID-19).
Methods: A previously published CNN model was trained to predict atherosclerotic disease from ambulatory frontal CXRs. The model was then validated on two cohorts of patients with COVID-19: 814 ambulatory patients from a suburban location (presenting from March 14, 2020, to October 24, 2020; the internal ambulatory cohort) and 485 hospitalized patients from an inner-city location (hospitalized from March 14, 2020, to August 12, 2020; the external hospitalized cohort). The CNN model predictions were validated against electronic health record administrative codes in both cohorts and assessed using the area under the receiver operating characteristic curve (AUC). The CXRs from the ambulatory cohort were also reviewed by two board-certified radiologists, and their readings were compared with the CNN-predicted values for the same cohort to produce a receiver operating characteristic curve and its AUC. The atherosclerosis diagnosis discrepancy, D_vasc, defined as the difference between the CNN-predicted value and the presence or absence of the vascular disease HCC code, was calculated. Linear regression was performed to determine the association of D_vasc with the covariates of age, sex, race/ethnicity, language preference, and social deprivation index. Logistic regression was used to test for an association between the presence of any HCC code and D_vasc and the other covariates.
Results: The CNN prediction for vascular disease from frontal CXRs in the ambulatory cohort had an AUC of 0.85 (95% confidence interval, 0.82-0.89) and in the hospitalized cohort had an AUC of 0.69 (95% confidence interval, 0.64-0.75) against the electronic health record data. In the ambulatory cohort, the consensus radiologists' reading had an AUC of 0.89 (95% confidence interval, 0.86-0.92) relative to the CNN. Multivariate linear regression of D_vasc in the ambulatory cohort demonstrated significant negative associations with non-English-language preference (β = −0.083, P < .05) and Black or Hispanic race/ethnicity (β = −0.048, P < .05) and positive associations with age (β = 0.005, P < .001) and sex (β = 0.044, P < .05). For the hospitalized cohort, age was also significant (β = 0.003, P < .01), as was social deprivation index (β = 0.002, P < .05). The D_vasc variable (odds ratio [OR], 0.34), Black or Hispanic race/ethnicity (OR, 1.58), non-English-language preference (OR, 1.74), and site (OR, 0.22) were independent predictors of having one or more hierarchical condition category codes (P < .01 for all) in the combined patient cohort.
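As an illustration only, the discrepancy variable and the AUC assessment described in the Methods might be computed along the following lines. This is a minimal sketch, not the study's code: the function names, the toy data, and the sign convention (CNN-predicted value minus the 0/1 code) are assumptions, and the regression shown is a plain least-squares fit with no significance testing.

```python
import numpy as np

def discrepancy(pred, hcc):
    """Discrepancy: CNN-predicted value minus the 0/1 vascular HCC code.
    The sign convention is an assumption made for this sketch."""
    return np.asarray(pred, dtype=float) - np.asarray(hcc, dtype=float)

def auc(scores, labels):
    """AUC as the probability that a randomly chosen coded-positive patient
    scores higher than a randomly chosen coded-negative one (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every positive/negative pair
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def ols(X, y):
    """Ordinary least-squares coefficients, intercept first."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Hypothetical toy data, not taken from the study.
pred = [0.9, 0.2, 0.7, 0.1, 0.6, 0.3]  # CNN-predicted values
hcc = [1, 0, 0, 0, 1, 0]               # vascular HCC code present (1) or absent (0)
age = [80, 45, 70, 30, 65, 50]         # one illustrative covariate

d_vasc = discrepancy(pred, hcc)
print("AUC against codes:", auc(pred, hcc))
print("Intercept and age coefficient:", ols(age, d_vasc))
```

Under this convention, a positive discrepancy means the CNN-predicted value exceeds what the administrative code records, so a positive regression coefficient marks a covariate associated with relatively more imaging-predicted than coded disease.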