Traditional learning algorithms applied to complex and highly imbalanced training sets may not give satisfactory results when distinguishing between examples of the classes. They tend to yield classification models that are biased towards the overrepresented (majority) class. This paper investigates this class imbalance problem in the context of multilayer perceptron (MLP) neural networks. The consequences of the equal cost (loss) assumption on imbalanced data are formally discussed from a statistical learning theory point of view. A new cost-sensitive algorithm (CSMLP) is presented to improve the discrimination ability of (two-class) MLPs. The CSMLP formulation is based on a joint objective function that uses a single cost parameter to distinguish the importance of class errors. The learning rule extends the Levenberg-Marquardt rule, ensuring the computational efficiency of the algorithm. In addition, it is theoretically demonstrated that the incorporation of prior information via the cost parameter may lead to balanced decision boundaries in the feature space. Based on the statistical analysis of results on real data, our approach shows a significant improvement in the area under the receiver operating characteristic curve and G-mean measures over regular MLPs.
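The idea of a single cost parameter weighting the class errors, and the G-mean measure used for evaluation, can be illustrated with a minimal sketch. The abstract does not give the exact CSMLP objective, so the convex weighting `lam * E_pos + (1 - lam) * E_neg`, the label convention {+1, -1}, and the function names below are assumptions made for illustration only; the Levenberg-Marquardt update itself is not shown.

```python
import math


def cost_sensitive_sse(y_true, y_pred, lam):
    """Cost-weighted sum of squared errors for labels in {+1, -1}.

    ASSUMPTION: lam in (0, 1) weights the positive-class error and
    (1 - lam) the negative-class error. This is one plausible form of a
    joint objective with a single cost parameter, not necessarily the
    one used by CSMLP.
    """
    pos_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred) if t == 1)
    neg_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred) if t == -1)
    return lam * pos_err + (1.0 - lam) * neg_err


def g_mean(y_true, y_label):
    """Geometric mean of the true positive rate and true negative rate."""
    pos_total = sum(1 for t in y_true if t == 1)
    neg_total = sum(1 for t in y_true if t == -1)
    if pos_total == 0 or neg_total == 0:
        raise ValueError("both classes must be present")
    tp = sum(1 for t, p in zip(y_true, y_label) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_label) if t == -1 and p == -1)
    return math.sqrt((tp / pos_total) * (tn / neg_total))
```

For a toy set with one positive and three negative examples, a model that always predicts the majority class reaches 75% accuracy but a G-mean of 0, which is why the G-mean is a more informative measure on imbalanced data. Raising `lam` above 0.5 makes the same positive-class error contribute more to the objective.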
This paper presents a comparison of three feature extraction methods to denoise partial discharge (PD) signals. The denoising technique employs the Stationary Wavelet Transform (SWT) combined with a spatially adaptive selection procedure based on the propagation of coefficients along decomposition levels (scales). The PD-related and noise-related coefficients are identified and separated by an automatic data classifier using Support Vector Machines (SVM). The first and second feature extraction methods act directly on the SWT coefficients and differ only in the procedures used to characterize the propagation. The third method relies on Cycle Spinning (CS) over the several translated Discrete Wavelet Transforms (DWTs) obtained from the SWT. We conducted an empirical study using Analysis of Variance (ANOVA) to evaluate the influence of the methods on denoising performance and to establish the statistical significance of the tests. Afterwards, performance was evaluated considering real PD signals measured in air and in solid dielectrics, corrupted by several types of interference, both stationary and time-varying. The results show that the three approaches allow robust signal recovery and significant noise rejection, but differ substantially in the quality of the reconstructed signals.
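The general principle of Cycle Spinning (denoise several circularly shifted copies of the signal, undo each shift, and average) can be sketched as follows. This is not the paper's method: the sketch assumes a one-level Haar DWT and replaces the SVM-based coefficient selection with plain soft thresholding, and all function names are hypothetical.

```python
import math


def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(v, thr):
    """Shrink v towards zero by thr (stand-in for coefficient selection)."""
    return math.copysign(max(abs(v) - thr, 0.0), v)


def cycle_spin_denoise(x, thr, shifts=(0, 1)):
    """Average the denoised results over circular shifts of the input.

    ASSUMPTION: x is a list of even length. For a one-level Haar
    transform only shifts 0 and 1 give distinct decompositions.
    """
    n = len(x)
    acc = [0.0] * n
    for k in shifts:
        shifted = x[k:] + x[:k]  # circular left shift by k samples
        approx, detail = haar_dwt(shifted)
        detail = [soft_threshold(d, thr) for d in detail]
        rec = haar_idwt(approx, detail)
        for i, v in enumerate(rec):
            acc[(i + k) % n] += v  # undo the shift before accumulating
    return [v / len(shifts) for v in acc]
```

With a threshold of zero the function returns the input unchanged (up to rounding), since each shifted transform is perfectly invertible. Averaging over shifts reduces the dependence of the result on how the signal aligns with the transform grid, which is the motivation for cycle spinning in general.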
A new learning method for classification problems that is suitable for integrated circuit implementation is presented. The method, which outperforms current approaches on many data sets, is based on a structural description of the learning set represented by a planar graph. The final classification function is composed of a hierarchical mixture of local experts, which yields a large margin classifier for the whole learning set. Since it is based only on distance calculations, on-chip learning can also be executed. The method is also appropriate for online and incremental learning, since model parameters are obtained directly from the data set, without the need for user interaction during learning.