In this paper, analysis of a simple model of recurrent network dynamics is used to gain qualitative insights into the training dynamics of multilayer perceptrons (MLPs). These insights allow the training methods used for MLPs to be modified so as to significantly improve network performance. In previous work [1], the Probabilistic Neural Network (PNN) [2] was shown to provide better zero-reject error performance on character and fingerprint classification problems than Radial Basis Function and MLP-based neural network methods. We will show that performance equal to or better than PNN can be achieved with a single three-layer MLP by making fundamental changes in the network optimization strategy. These changes are: 1) neuron activation functions are used which reduce the probability of singular Jacobians; 2) successive regularization is used to constrain the volume of the minimized weight space; 3) Boltzmann pruning [3] is used to constrain the dimension of the weight space; and 4) prior class probabilities are used to normalize all error calculations, so that statistically significant samples of rare but important classes can be included without distorting the error surface. All four of these changes are made in the inner loop of a conjugate gradient optimization iteration [4] and are intended to simplify the training dynamics of the optimization. On handprinted digit and fingerprint classification problems these modifications improve error-reject performance by factors between 2 and 4 and reduce network size by 40% to 60%.
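To make the fourth change concrete, the following is a minimal sketch of prior-class-probability normalization of the training error: each sample's contribution is reweighted by the ratio of an assumed class prior to the class's frequency in the training set, so an oversampled rare class does not distort the error surface. The function name, the mean-squared-error form, and the example priors are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: class-prior normalization of the training error.
import numpy as np

def prior_normalized_mse(outputs, targets, labels, class_priors):
    """MSE in which each sample is scaled by prior_k / freq_k for its class k,
    where freq_k is the class frequency in the training sample and prior_k is
    the assumed prior probability of class k in the field."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes, counts / labels.size))
    weights = np.array([class_priors[c] / freq[c] for c in labels])
    per_sample = np.mean((outputs - targets) ** 2, axis=1)
    return np.sum(weights * per_sample) / np.sum(weights)

# Example: a rare class (label 2) is oversampled for statistical significance,
# but its weighted contribution still reflects the assumed 10% prior.
rng = np.random.default_rng(0)
outputs = rng.random((6, 3))
labels = [0, 0, 1, 2, 2, 2]
targets = np.eye(3)[labels]
priors = {0: 0.45, 1: 0.45, 2: 0.10}
print(prior_normalized_mse(outputs, targets, labels, priors))
```

In a training loop this normalized error (and its gradient, weighted the same way) would simply replace the unweighted error inside the conjugate gradient inner iteration.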