PURPOSE
To determine whether quantitative analyses (“radiomics”) of low-dose CT lung cancer screening images at baseline can predict the subsequent emergence of cancer.
PATIENTS AND METHODS
Public data from the National Lung Screening Trial (ACRIN 6684) were assembled into two cohorts of 104 and 92 patients with screen-detected lung cancer (SDLC), then matched to cohorts of 208 and 196 screening subjects with benign pulmonary nodules (bPN). Image features were extracted from each nodule and used to predict the subsequent emergence of cancer.
RESULTS
The best models used 23 stable features in a Random Forest classifier and could predict which nodules would become cancerous 1 and 2 years hence with accuracies of 80% (AUC 0.83) and 79% (AUC 0.75), respectively. Radiomics outperformed Lung-RADS and volume, while McWilliams’ risk assessment model performed comparably.
CONCLUSION
Radiomics of lung cancer screening CTs at baseline can be used to assess risk for development of cancer.
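The results above are reported as an accuracy paired with an area under the ROC curve (AUC). As a minimal sketch of how those two numbers are computed from classifier outputs, the following uses only the Python standard library. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not taken from the study.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = nodule later became cancer, 0 = stayed benign.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"accuracy = {toy_accuracy:.2f}, AUC = {toy_auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two can diverge between models.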
Non-small cell lung cancer is a prevalent disease. It is diagnosed and treated with the help of computed tomography (CT) scans. In this paper, we apply radiomics to select 3-D features from CT images of the lung toward providing prognostic information. Focusing on cases of the adenocarcinoma subtype of non-small cell lung cancer from a larger data set, we show that classifiers can be built to predict survival time. This is the first known result to make such predictions from CT scans of lung cancer. We compare classifiers and feature selection approaches. The best accuracy when predicting survival was 77.5%, obtained with a decision tree in a leave-one-out cross-validation after selecting five features per fold from 219.
INDEX TERMS
Computed tomography, CT 3D texture features, support vector machine, Naive Bayes, decision tree.
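The abstract above selects features per fold inside a leave-one-out cross-validation. The sketch below illustrates that pattern: feature ranking is redone on the training cases of every fold, so the held-out case never influences which features are chosen. It is an illustration under stated assumptions, not the authors' pipeline. The class-separation score and the nearest-centroid classifier are simple stand-ins for the paper's feature selectors and decision tree, and all names and toy data are hypothetical.

```python
from statistics import mean, pstdev


def rank_features(rows, labels, k):
    """Return indices of the k features with the largest class separation.

    Score = |mean_1 - mean_0| / (overall std + eps); an illustrative filter only.
    """
    scores = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        col0 = [v for v, y in zip(col, labels) if y == 0]
        col1 = [v for v, y in zip(col, labels) if y == 1]
        spread = pstdev(col) + 1e-12
        scores.append((abs(mean(col1) - mean(col0)) / spread, j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(train_rows, train_labels, test_row, features):
    """Predict the class whose centroid (on the selected features) is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_labels)):
        members = [r for r, y in zip(train_rows, train_labels) if y == label]
        centroid = [mean(r[j] for r in members) for j in features]
        dist = sum((test_row[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(rows, labels, k):
    """Leave-one-out accuracy with feature selection redone in every fold."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # Selection sees training cases only, never the held-out case.
        features = rank_features(train_rows, train_labels, k)
        predicted = nearest_centroid_predict(train_rows, train_labels, rows[i], features)
        correct += predicted == labels[i]
    return correct / len(rows)


# Hypothetical toy data: only the first feature separates the two classes.
toy_rows = [
    [1.0, 5.0, 0.3, 2.0], [1.2, 4.0, 0.9, 2.1],
    [0.9, 6.0, 0.1, 1.9], [1.1, 5.5, 0.7, 2.2],
    [3.0, 5.2, 0.4, 2.0], [3.2, 4.1, 0.8, 2.1],
    [2.9, 5.9, 0.2, 1.8], [3.1, 5.4, 0.6, 2.3],
]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_loocv_accuracy = loocv_accuracy(toy_rows, toy_labels, k=1)
print(f"LOOCV accuracy = {toy_loocv_accuracy:.2f}")
```

Selecting features once on the full data set before cross-validating would leak information from the held-out case and tend to inflate the reported accuracy, which is presumably why the selection is done per fold.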
Lung cancer is the most common cause of cancer-related deaths in the USA. It can be detected and diagnosed using computed tomography images. For an automated classifier, identifying predictive features from medical images is a key concern. Deep feature extraction using pretrained convolutional neural networks (CNNs) has recently been successfully applied in some image domains. Here, we applied a pretrained CNN to extract deep features from 40 computed tomography images, with contrast, of non-small cell adenocarcinoma lung cancer, combined the deep features with traditional image features, and trained classifiers to predict short- and long-term survivors. We experimented with several pretrained CNNs and several feature selection strategies. The best previously reported accuracy when using traditional quantitative features was 77.5% (area under the curve [AUC], 0.712), which was achieved by a decision tree classifier. The best reported accuracy from transfer learning and deep features was 77.5% (AUC, 0.713) using a decision tree classifier. When extracted deep neural network features were combined with traditional quantitative features, we obtained an accuracy of 90% (AUC, 0.935) with the 5 best post-rectified linear unit features extracted from a vgg-f pretrained CNN and the 5 best traditional features. The best results were achieved with the symmetric uncertainty feature ranking algorithm followed by a random forest classifier.
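Symmetric uncertainty, the feature ranking named above, is commonly defined for discrete variables as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below ranks a pooled set of deep and traditional features by their SU with the class label. The equal-width binning of continuous features, the bin count, and all feature names and values are assumptions for illustration; the abstract does not state how the features were discretized.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)) for discrete sequences x and y."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing is shared
    h_xy = entropy(list(zip(x, y)))
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


def equal_width_bins(values, n_bins=3):
    """Discretize a continuous feature into equal-width bins (an assumption)."""
    low, high = min(values), max(values)
    if high == low:
        return [0] * len(values)
    width = (high - low) / n_bins
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def rank_by_su(columns, labels, top_k):
    """Return the names of the top_k features ranked by SU with the label."""
    scored = [(symmetric_uncertainty(equal_width_bins(col), labels), name)
              for name, col in columns.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical pooled feature table: two "deep" and two "traditional" columns.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_columns = {
    "deep_17": [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],
    "deep_42": [0.5, 0.1, 0.9, 0.5, 0.1, 0.9],
    "trad_volume": [1.0, 1.1, 0.9, 2.0, 2.1, 1.9],
    "trad_texture": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
}
top_features = rank_by_su(toy_columns, toy_labels, top_k=2)
print(top_features)
```

Because SU is normalized by the two entropies, it does not favor features simply for taking many distinct values, which makes it convenient for ranking deep and traditional features on a common scale before passing the top-ranked ones to a classifier.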
Lung cancer has a high incidence and mortality rate. Early detection and diagnosis of lung cancers is best achieved with low-dose computed tomography (CT). Classical radiomics features extracted from lung CT images have been shown to predict cancer incidence and prognosis. With the advancement of deep learning and convolutional neural networks (CNNs), deep features can be identified to analyze lung CTs for prognosis prediction and diagnosis. Because the number of available images in the medical field is limited, transfer learning can be helpful. Using subsets of participants from the National Lung Screening Trial (NLST), we utilized a transfer learning approach to differentiate lung cancer nodules from positive controls. We experimented with three different pretrained CNNs for extracting deep features and used five different classifiers. Experiments were also conducted with deep features from different color channels of a pretrained CNN. Selected deep features were combined with radiomics features. A CNN was designed and trained. Combinations of features from pretrained CNNs, CNNs trained on NLST data, and classical radiomics were used to build classifiers. The best accuracy (76.79%) was obtained using feature combinations. An area under the receiver operating characteristic curve of 0.87 was obtained using a CNN trained on an augmented NLST data cohort.
Semantic and radiomic features derived from the primary lung tumor could provide information on pathological nodal involvement in clinical N0 peripheral lung adenocarcinomas.