Background: Artificial intelligence (AI) products have been widely used for the clinical detection of primary lung tumors. However, their performance and accuracy in risk prediction for metastases or benign lesions remain underexplored. This study evaluated the accuracy of an AI-driven commercial computer-aided detection (CAD) product (InferRead CT Lung Research, ICLR) in malignancy risk prediction using a real-world database.
Methods: This retrospective study assessed 486 consecutive resected lung lesions, including 320 adenocarcinomas, 40 other malignancies, 55 metastases, and 71 benign lesions, from September 2015 to November 2018. The malignancy risk probability of each lesion was obtained from the ICLR software, which is based on a 3D convolutional neural network (CNN) with a DenseNet backbone and uses no clinical data. Two resident doctors independently graded each lesion using the patient's clinical history. One doctor (R1) had 3 years of chest radiology experience, and the other (R2) had 3 years of general radiology experience. Cochran's Q test was used to compare the performance of the AI with that of the radiologists.
Results: The accuracy of malignancy risk prediction using the ICLR for adenocarcinomas, other malignancies, metastases, and benign lesions was 93.4% (299/320), 95.0% (38/40), 50.9% (28/55), and 40.8% (29/71), respectively. The accuracy was significantly higher for adenocarcinomas and other malignancies than for metastases and benign lesions (all P<0.05). The overall accuracy of risk prediction was 93.6% (455/486) for R1 and 87.4% (425/486) for R2, both higher than the 81.1% (394/486) obtained with the ICLR (R1 vs. ICLR: P<0.001; R2 vs. ICLR: P=0.001), especially in assessing the risk of metastases (P<0.05). R1 performed better than R2 at risk prediction (P=0.001).
Conclusions: The accuracy of the ICLR for risk prediction is very high for primary lung cancers but poor for metastases and benign lesions.
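As a check on the arithmetic, the ICLR accuracies reported in the Results can be re-derived from the stated counts. The following is a minimal sketch in Python, not part of the study's analysis; the dictionary layout and the helper name `accuracy_pct` are illustrative assumptions, and only the counts come from the abstract. It does not reproduce Cochran's Q test, which requires per-lesion data that the abstract does not provide.

```python
# Correct predictions and total lesions per category, as reported in the Results.
# The variable and function names are illustrative, not taken from the study.
counts = {
    "adenocarcinomas": (299, 320),
    "other malignancies": (38, 40),
    "metastases": (28, 55),
    "benign lesions": (29, 71),
}


def accuracy_pct(correct, total):
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Per-category accuracy of the ICLR.
per_category = {name: accuracy_pct(c, n) for name, (c, n) in counts.items()}

# Overall accuracy pools the correct predictions across all four categories.
overall_correct = sum(c for c, _ in counts.values())
overall_total = sum(n for _, n in counts.values())
overall = accuracy_pct(overall_correct, overall_total)

for name, pct in per_category.items():
    print(f"{name}: {pct}%")
print(f"overall: {overall}% ({overall_correct}/{overall_total})")
```

The pooled total of 394 correct predictions out of 486 lesions matches the overall ICLR accuracy of 81.1% given in the Results, which confirms that the per-category counts and the overall figure are consistent with each other.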