The automation of bias in medical Artificial Intelligence (AI): Decoding the past to create a better future

Straw, Isabel

doi:10.1016/j.artmed.2020.101965

Cited by 52 publications

(37 citation statements)

References 6 publications


“…Deep learning (DL) has been extensively documented to propagate health care disparities and biases, mostly through the use of biased training data, limiting its generalizability [11]. Conversely, it is possible to use DL algorithms to detect such disparities.…”

Section: Introduction (mentioning)

confidence: 99%

Detecting Racial/Ethnic Health Disparities Using Deep Learning From Frontal Chest Radiography

Pyrros, Rodríguez-Fernández, Borstelmann, et al. 2022

Journal of the American College of Radiology


Purpose: The aim of this study was to assess racial/ethnic and socioeconomic disparities in the difference between atherosclerotic vascular disease prevalence measured by a multitask convolutional neural network (CNN) deep learning model using frontal chest radiographs (CXRs) and the prevalence reflected by administrative hierarchical condition category (HCC) codes in two cohorts of patients with coronavirus disease 2019 (COVID-19). Methods: A CNN model, previously published, was trained to predict atherosclerotic disease from ambulatory frontal CXRs. The model was then validated on two cohorts of patients with COVID-19: 814 ambulatory patients from a suburban location (presenting from March 14, 2020, to October 24, 2020, the internal ambulatory cohort) and 485 hospitalized patients from an inner-city location (hospitalized from March 14, 2020, to August 12, 2020, the external hospitalized cohort). The CNN model predictions were validated against electronic health record administrative codes in both cohorts and assessed using the area under the receiver operating characteristic curve (AUC). The CXRs from the ambulatory cohort were also reviewed by two board-certified radiologists and compared with the CNN-predicted values for the same cohort to produce a receiver operating characteristic curve and the AUC. The atherosclerosis diagnosis discrepancy, D_vasc, referring to the difference between the predicted value and presence or absence of the vascular disease HCC categorical code, was calculated. Linear regression was performed to determine the association of D_vasc with the covariates of age, sex, race/ethnicity, language preference, and social deprivation index. Logistic regression was used to look for an association between the presence of any hierarchical condition category codes with D_vasc and other covariates.
Results: The CNN prediction for vascular disease from frontal CXRs in the ambulatory cohort had an AUC of 0.85 (95% confidence interval, 0.82-0.89) and in the hospitalized cohort had an AUC of 0.69 (95% confidence interval, 0.64-0.75) against the electronic health record data. In the ambulatory cohort, the consensus radiologists' reading had an AUC of 0.89 (95% confidence interval, 0.86-0.92) relative to the CNN. Multivariate linear regression of D_vasc in the ambulatory cohort demonstrated significant negative associations with non-English-language preference (b = −0.083, P < .05) and Black or Hispanic race/ethnicity (b = −0.048, P < .05) and positive associations with age (b = 0.005, P < .001) and sex (b = 0.044, P < .05). For the hospitalized cohort, age was also significant (b = 0.003, P < .01), as was social deprivation index (b = 0.002, P < .05). The D_vasc variable (odds ratio [OR], 0.34), Black or Hispanic race/ethnicity (OR, 1.58), non-English-language preference (OR, 1.74), and site (OR, 0.22) were independent predictors of having one or more hierarchical condition category codes (P < .01 for all) in the combined patient cohort.
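As a rough illustration of the discrepancy measure described in the abstract above, the sketch below computes a per-patient difference between a model's predicted value and a 0/1 indicator for the administrative code. The function name, the sign convention (prediction minus code), and the toy values are assumptions made for illustration; none of them are taken from the study.

```python
def diagnosis_discrepancy(predicted, has_code):
    """Per-patient discrepancy: predicted value minus the 0/1 code indicator.

    predicted: model outputs in [0, 1] (hypothetical values here).
    has_code:  True if the administrative code is present for that patient.
    The sign convention is an assumption, not confirmed by the source.
    """
    return [p - int(c) for p, c in zip(predicted, has_code)]


# Toy, made-up inputs: three patients.
predicted = [0.75, 0.25, 0.5]
has_code = [True, False, False]

discrepancy = diagnosis_discrepancy(predicted, has_code)
print(discrepancy)
```

Under this convention, a positive value means the model scored the patient higher than the coded record suggests. Values like these could then serve as the outcome variable in a regression on demographic covariates, as the abstract describes.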



“…The author proposes, going forward, to decode the present and reshape existing practices before implementing AI to avoid existing biases and further increasing health disparities. 59 Colling et al propose a UK-wide strategy for AI and DP. If the requirements of proper slide image management software, integrated reporting systems, improved scanning speeds, and high-quality images for DP systems are achieved then it will provide time and cost saving benefits over the traditional microscope based pathology approach and reduce problem of inter-observer variation.…”

Section: AI - Issues To Be Resolved (mentioning)

confidence: 99%

Artificial Intelligence (AI) in Pathology – A Summary and Challenges

Buch, Kulkarni 2021

GJMR


This bibliographic study covers Artificial Intelligence (AI) theory and its applications in healthcare, particularly in the discipline of pathology. This review includes the basics of AI, supervised and unsupervised machine learning (ML), various supervised ML algorithms, and their applications in healthcare and pathology. Digital pathology with deep machine learning is more advantageous than traditional pathology, which is based on a ‘physical slide on a physical microscope’. However, various implementation challenges for AI in clinical practice remain, including cost, data quality, multi-center validation, bias, and regulatory approval; these are also described in this study.


“…In 2021, the UK parliamentary report on the gender health gap highlighted that the UK has the largest female health gap in the G20 and the 12th largest globally 5. The exclusion of females from research trials (extending to animal research), the neglect of female bodies throughout medical pedagogy and the unconscious biases of practitioners are a few of the intersecting factors that result in worse health outcomes for female patients 6–10…”

Section: Introduction (mentioning)

confidence: 99%

“…These ‘biochemical markers’ include proteins made by the liver (eg, albumin), and enzymes required for metabolism (eg, aspartate aminotransferase (AST)). Bias research has illustrated that biochemical markers are not equally effective for all patient groups 3 7 10–12. Suthahar et al describe how sex differences in biomarker thresholds affect objectivity in management, as what is considered ‘normal’ in one sex, may not be so in the other 12.…”

Section: Introduction (mentioning)

confidence: 99%

“…Furthermore, Vatsalya et al and Stepien et al describe sex differences in biochemical cut offs, highlighting that the milder expression of liver injury for females may result in female disease going undetected 3 13. Such disparities in the predictive potential of clinical biomarkers have the potential to exacerbate healthcare inequalities 6 7 10 12…”

Section: Introduction (mentioning)

confidence: 99%


Investigating for bias in healthcare algorithms: a sex-stratified analysis of supervised machine learning models in liver disease prediction

Straw 2022

BMJ Health Care Inform

Self Cite


Objectives: The Indian Liver Patient Dataset (ILPD) is used extensively to create algorithms that predict liver disease. Given the existing research describing demographic inequities in liver disease diagnosis and management, these algorithms require scrutiny for potential biases. We address this overlooked issue by investigating ILPD models for sex bias. Methods: Following our literature review of ILPD papers, the models reported in existing studies are recreated and then interrogated for bias. We define four experiments, training on sex-unbalanced/balanced data, with and without feature selection. We build random forests (RFs), support vector machines (SVMs), Gaussian Naïve Bayes and logistic regression (LR) classifiers, running experiments 100 times, reporting average results with SD. Results: We reproduce published models achieving accuracies of >70% (LR 71.31% (2.37 SD) – SVM 79.40% (2.50 SD)) and demonstrate a previously unobserved performance disparity. Across all classifiers females suffer from a higher false negative rate (FNR). Presently, RF and LR classifiers are reported as the most effective models, yet in our experiments they demonstrate the greatest FNR disparity (RF: −21.02%; LR: −24.07%). Discussion: We demonstrate a sex disparity that exists in published ILPD classifiers. In practice, the higher FNR for females would manifest as increased rates of missed diagnosis for female patients and a consequent lack of appropriate care. Our study demonstrates that evaluating biases in the initial stages of machine learning can provide insights into inequalities in current clinical practice, reveal pathophysiological differences between males and females, and can mitigate the digitisation of inequalities into algorithmic systems. Conclusion: Our findings are important to medical data scientists, clinicians and policy-makers involved in the implementation of medical artificial intelligence systems. An awareness of the potential biases of these systems is essential in preventing the digital exacerbation of healthcare inequalities.
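To make the sex-stratified false negative rate (FNR) comparison in the abstract above concrete, here is a minimal sketch in plain Python. It computes FNR = FN / (FN + TP) separately for each group and then takes a difference. The function name, the toy labels, and the "male minus female" sign convention are assumptions for illustration; the data are made up and are not drawn from the ILPD or from the study's results.

```python
from collections import defaultdict


def false_negative_rate_by_group(y_true, y_pred, groups):
    """Return {group: FNR}, with FNR = FN / (FN + TP) over true positives.

    Groups with no positive cases are omitted to avoid division by zero.
    """
    false_negatives = defaultdict(int)
    positives = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 0:
                false_negatives[group] += 1
    return {g: false_negatives[g] / positives[g] for g in positives}


# Toy, made-up labels: 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sex = ["M", "M", "M", "M", "F", "F", "F", "M"]

rates = false_negative_rate_by_group(y_true, y_pred, sex)
# Assumed convention: a negative value means females have the higher FNR.
disparity = rates["M"] - rates["F"]
print(rates, disparity)
```

In this toy data the female group has the higher FNR, so the disparity comes out negative. Stratifying before averaging matters because an overall accuracy figure can hide exactly this kind of gap.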

