Association of Artificial Intelligence–Aided Chest Radiograph Interpretation With Reader Performance and Efficiency

Ahn, Jong Seok; Ebrahimian, Shadi; McDermott, Shaunagh; Lee, Sang-Hyup; Naccarato, Laura; Capua, John F. Di; Wu, Markus; Zhang, Eric W.; Muse, Victorine V.; Miller, Benjamin F.; Sabzalipour, Farid; Bizzo, Bernardo C.; Dreyer, Keith J.; Kaviani, Parisa; Digumarthy, Subba R.; Kalra, Mannudeep K.

doi:10.1001/jamanetworkopen.2022.29289

Cited by 58 publications

(46 citation statements)

References 41 publications

“…The problem is that most of the work on assessing the diagnostic accuracy of AI algorithms for CXR indicates metrics obtained by developers on limited datasets in the so-called “laboratory conditions”. As can be seen from recent studies [12, 15], the metrics obtained in this way look attractive for the subsequent implementation of such algorithms in clinical practice. Will AI for CXR analysis also work well and demonstrate high diagnostic accuracy metrics in real clinical practice?…”

Section: Introduction | mentioning

confidence: 94%

“…The diagnostic accuracy of the algorithms provided by the developers is quite high [7, 8, 9], reaching the same accuracy for radiologists [10], and for some solutions even exceeding them [11, 12]. As of the beginning of 2023, 29 AI-based software products have European certification for medical use as a medical device (CE MDR/MDD), of which 11 have passed a similar certification in the United States [13].…”

Section: Introduction | mentioning

confidence: 99%

AI-Based CXR First Reading: Current Limitations to Ensure Practical Value

Vasilev, Vladzymyrskyy, Omelyanskaya, et al. 2023

Diagnostics

We performed a multicenter external evaluation of the practical and clinical efficacy of a commercial AI algorithm for chest X-ray (CXR) analysis (Lunit INSIGHT CXR). A retrospective evaluation was performed with a multi-reader study. For a prospective evaluation, the AI model was run on CXR studies; the results were compared to the reports of 226 radiologists. In the multi-reader study, the area under the curve (AUC), sensitivity, and specificity of the AI were 0.94 (CI95%: 0.87–1.0), 0.9 (CI95%: 0.79–1.0), and 0.89 (CI95%: 0.79–0.98); the AUC, sensitivity, and specificity of the radiologists were 0.97 (CI95%: 0.94–1.0), 0.9 (CI95%: 0.79–1.0), and 0.95 (CI95%: 0.89–1.0). In most regions of the ROC curve, the AI performed a little worse or at the same level as an average human reader. The McNemar test showed no statistically significant differences between AI and radiologists. In the prospective study with 4752 cases, the AUC, sensitivity, and specificity of the AI were 0.84 (CI95%: 0.82–0.86), 0.77 (CI95%: 0.73–0.80), and 0.81 (CI95%: 0.80–0.82). Lower accuracy values obtained during the prospective validation were mainly associated with false-positive findings considered by experts to be clinically insignificant and the false-negative omission of human-reported “opacity”, “nodule”, and calcification. In a large-scale prospective validation of the commercial AI algorithm in clinical practice, lower sensitivity and specificity values were obtained compared to the prior retrospective evaluation of the data of the same population.

“…AI has a tremendous potential to revolutionize health care and make it more efficient by improving diagnostics, detecting medical errors, and reducing the burden of paperwork (3, 4); however, chances are it will never replace physicians. Algorithms perform relatively well on knowledge-based tests despite the lack of domain-specific training; ChatGPT achieved ~66% and ~72% on Basic Life Support and Advanced Cardiovascular Life Support tests, respectively (5), and performed at or near the passing threshold on the United States Medical Licensing Exam (6, 7).…”

Section: Can ChatGPT Replace Physicians? | mentioning

confidence: 99%

Opportunities and risks of ChatGPT in medicine, science, and academic publishing: a modern Promethean dilemma

Homolak

2023

Croat Med J

“…For artificial-intelligence-based computer-aided detection (AI-CAD) tools, the primary aim is to enhance the detection performance of interpreting radiologists or physicians [2, 4, 5, 6, 7]. Therefore, in addition to intrinsic performance, the method of delivering the results of the analysis to physicians is the key component of an AI-CAD tool to demonstrate its efficacy and value in clinical practice.…”

Section: Introduction | mentioning

confidence: 99%

Methods of Visualizing the Results of an Artificial-Intelligence-Based Computer-Aided Detection System for Chest Radiographs: Effect on the Diagnostic Performance of Radiologists

et al. 2023

It is unclear whether the visualization methods for artificial-intelligence-based computer-aided detection (AI-CAD) of chest radiographs influence the accuracy of readers’ interpretation. We aimed to evaluate the accuracy of radiologists’ interpretations of chest radiographs using different visualization methods for the same AI-CAD. Initial chest radiographs of patients with acute respiratory symptoms were retrospectively collected. A commercialized AI-CAD using three different methods of visualizing was applied: (a) closed-line method, (b) heat map method, and (c) combined method. A reader test was conducted with five trainee radiologists over three interpretation sessions. In each session, the chest radiographs were interpreted using AI-CAD with one of the three visualization methods in random order. Examination-level sensitivity and accuracy, and lesion-level detection rates for clinically significant abnormalities were evaluated for the three visualization methods. The sensitivity (p = 0.007) and accuracy (p = 0.037) of the combined method are significantly higher than those of the closed-line method. Detection rates using the heat map method (p = 0.043) and the combined method (p = 0.004) are significantly higher than those using the closed-line method. The methods for visualizing AI-CAD results for chest radiographs influenced the performance of radiologists’ interpretations. Combining the closed-line and heat map methods for visualizing AI-CAD results led to the highest sensitivity and accuracy of radiologists.
