Small sample sizes and unbalanced sample distributions are two main problems when data-driven methods are applied to fault diagnosis in practical engineering. Sample generation and data augmentation have proven effective in addressing these problems. The generative adversarial network (GAN) has been widely used in recent years as a representative generative model. Besides the general GAN, many variants have recently been reported that address its inherent problems, such as mode collapse and slow convergence. In addition, many new techniques are being proposed to improve the quality of the generated samples. Therefore, a systematic review of GAN, especially its application in fault diagnosis, is necessary. In this paper, the theory and structure of GAN and of variants such as ACGAN, VAEGAN, DCGAN, and WGAN are presented first. Then, the literature on GANs is categorized and analyzed from two aspects: improvements in the GAN’s structure and improvements in its loss function. Specifically, the structural improvements are classified into three types: information-based, input-based, and layer-based. The loss-function modifications are sorted into two types: metric-based and regularization-based. Afterwards, the evaluation metrics for generated samples are summarized and compared. Finally, typical applications of GAN in bearing fault diagnosis are listed, and challenges for further research are discussed.
Convolutional Neural Networks (CNNs) have been widely used in bearing fault diagnosis in recent years, and many satisfactory results have been reported. However, when the training dataset is unbalanced, for example when the samples for some fault labels are very limited, the CNN’s performance inevitably degrades. To solve the dataset imbalance problem, the Generative Adversarial Network (GAN) has become a preferred choice for data generation. In published studies, the GAN focuses only on the overall similarity of the generated data to the original measurement. The similarity in the fault characteristics, which carry more information for fault diagnosis, is ignored. To bridge this gap, this paper proposes two modifications to the general GAN. Firstly, a CNN is coupled with the GAN, and the two networks are optimized collaboratively. The GAN provides a more balanced dataset for the CNN, and the CNN's fault diagnosis result is fed back as a correction term in the GAN generator’s loss function to improve the GAN’s performance. Secondly, the similarity of the envelope spectrum between the generated data and the original measurement is considered. The envelope spectrum error at the 1st to 5th orders of the Fault Characteristic Frequency (FCF) is taken as another correction term in the GAN generator’s loss function. Experimental results show that the bearing fault samples generated by the optimized GAN contain more fault information than the samples produced by the general GAN. Furthermore, after data augmentation of the unbalanced training sets, the CNN’s fault classification accuracy improves significantly.
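As a rough illustration of the second modification, the envelope-spectrum error at the first five FCF orders could be computed along the following lines. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact error definition or how the terms are weighted, so the function names, the mean-absolute-difference error, the search tolerance `tol`, and the weights `lam_cls` and `lam_env` are all hypothetical. In actual GAN training the same computation would have to be written with differentiable operations in the deep learning framework in use.

```python
import numpy as np

def envelope_spectrum(x, fs):
    """Return (freqs, amplitude) of the envelope spectrum of a real signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Analytic signal via the FFT-based Hilbert transform:
    # keep DC (and Nyquist), double positive frequencies, zero negative ones.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(x) * h)
    envelope = np.abs(analytic)
    envelope -= envelope.mean()  # remove DC so it does not dominate the spectrum
    amp = np.abs(np.fft.rfft(envelope)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amp

def fcf_envelope_error(real, fake, fs, fcf, orders=5, tol=2.0):
    """Assumed error: mean absolute difference between the envelope-spectrum
    peaks of `real` and `fake` near k * fcf, for k = 1..orders."""
    freqs, amp_real = envelope_spectrum(real, fs)
    _, amp_fake = envelope_spectrum(fake, fs)
    errors = []
    for k in range(1, orders + 1):
        band = np.abs(freqs - k * fcf) <= tol
        if not band.any():
            # Fall back to the nearest bin when the resolution is coarser than tol.
            band[np.argmin(np.abs(freqs - k * fcf))] = True
        errors.append(abs(amp_real[band].max() - amp_fake[band].max()))
    return float(np.mean(errors))

def generator_loss(adv_loss, cls_loss, env_error, lam_cls=1.0, lam_env=1.0):
    """Hypothetical combination: adversarial loss plus the two correction terms
    (CNN diagnosis loss and envelope-spectrum error) with assumed weights."""
    return adv_loss + lam_cls * cls_loss + lam_env * env_error

# Synthetic example: a 3 kHz carrier amplitude-modulated at a 100 Hz "FCF",
# compared against an unmodulated carrier that lacks the fault signature.
fs = 12000
t = np.arange(fs) / fs
real = (1.0 + 0.5 * np.cos(2 * np.pi * 100 * t)) * np.sin(2 * np.pi * 3000 * t)
fake = np.sin(2 * np.pi * 3000 * t)
env_error = fcf_envelope_error(real, fake, fs, fcf=100.0)
```

The two synthetic signals have the same carrier and differ only in the modulation at the fault frequency, so the envelope-spectrum term penalizes the unmodulated sample for missing the fault signature. That is the kind of fault-characteristic mismatch the abstract says an overall-similarity objective ignores.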