Data imbalance issue generally exists in most medical image analysis problems and maybe getting important with the popularization of data-hungry deep learning paradigms. We explore the cuttingedge Wasserstein generative adversarial networks (WGANs) to address the data imbalance problem with oversampling on the minority classes. The WGAN can estimate the underlying distribution of a minority class to synthesize more plausible and helpful samples for the classification model. In this paper, the WGANbased over-sampling technique is applied to augment the data to balance for the fine-grained classification of seven semantic attributes of lung nodules in computed tomography images. The fine-grained classification is carried out with a normal convolutional neural network (CNN). To further illustrate the efficacy of the WGAN-based over-sampling technique, the conventional data augmentation method commonly used in many deep learning works, the generative adversarial networks (GANs), and the deep convolutional generative adversarial networks (DCGANs) are implemented for comparison. The whole schemes of the minority oversampling and fine-grained classification are tested with the public lung imaging database consortium dataset. The experimental results suggest that the WGAN-based oversampling technique can synthesize helpful samples for the minority classes to assist the training of the CNN model and to boost the fine-grained classification performance better than the conventional data augmentation method and the two schemes of the GAN and DCGAN techniques do. It may thus suggest that the WGAN technique offers an alternative methodological option for the further deep learning on imbalanced classification studies.INDEX TERMS Computer-aided diagnosis (CAD), lung nodule, computed tomography (CT), synthetic minority over-sampling, deep learning, data imbalance, adversarial neural networks.
Background
Pneumothorax (PTX) may cause a life-threatening medical emergency with cardio-respiratory collapse that requires immediate intervention and rapid treatment. The screening and diagnosis of pneumothorax usually rely on chest radiographs. However, the pneumothoraces in chest X-rays may be very subtle with highly variable in shape and overlapped with the ribs or clavicles, which are often difficult to identify. Our objective was to create a large chest X-ray dataset for pneumothorax with pixel-level annotation and to train an automatic segmentation and diagnosis framework to assist radiologists to identify pneumothorax accurately and timely.
Methods
In this study, an end-to-end deep learning framework is proposed for the segmentation and diagnosis of pneumothorax on chest X-rays, which incorporates a fully convolutional DenseNet (FC-DenseNet) with multi-scale module and spatial and channel squeezes and excitation (scSE) modules. To further improve the precision of boundary segmentation, we propose a spatial weighted cross-entropy loss function to penalize the target, background and contour pixels with different weights.
Results
This retrospective study are conducted on a total of eligible 11,051 front-view chest X-ray images (5566 cases of PTX and 5485 cases of Non-PTX). The experimental results show that the proposed algorithm outperforms the five state-of-the-art segmentation algorithms in terms of mean pixel-wise accuracy (MPA) with $$0.93\pm 0.13$$
0.93
±
0.13
and dice similarity coefficient (DSC) with $$0.92\pm 0.14$$
0.92
±
0.14
, and achieves competitive performance on diagnostic accuracy with 93.45% and $$F_1$$
F
1
-score with 92.97%.
Conclusion
This framework provides substantial improvements for the automatic segmentation and diagnosis of pneumothorax and is expected to become a clinical application tool to help radiologists to identify pneumothorax on chest X-rays.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.