“…Deep neural networks, which combine feature extraction and classification, have achieved promising results in several application areas, such as computer vision, natural language processing, speech recognition, and text classification [25]. To improve fault diagnosis performance using a PRPD in a GIS, deep learning models such as convolutional neural networks (CNNs) [26], recurrent neural networks (RNNs) [27], and self-attention [28] have been employed, obtaining state-of-the-art results. In [26], a CNN was proposed to learn the local response from the temporal or spatial signals of a PRPD.…”