Proper initialization is one of the most important prerequisites for fast convergence of feed-forward neural networks such as high order and multilayer perceptrons. This publication aims at determining the optimal value of the initial weight variance (or range), which is the principal parameter of random weight initialization methods for both types of neural networks. An overview of random weight initialization methods for multilayer perceptrons is presented. These methods are extensively tested on eight real-world benchmark data sets and a broad range of initial weight variances by means of more than 30 000 simulations, with the aim of finding the best weight initialization method for multilayer perceptrons. For high order networks, a large number of experiments (more than 200 000 simulations) was performed, using three weight distributions, three activation functions, several network orders, and the same eight data sets. The results of these experiments are compared with weight initialization techniques for multilayer perceptrons, which leads to the proposal of a suitable weight initialization method for high order perceptrons. The conclusions on the weight initialization methods for both types of networks are justified by sufficiently small confidence intervals of the mean convergence times.
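As a rough illustration of the kind of method studied in this abstract, the sketch below draws the weights of one fully connected layer either uniformly from a symmetric range or from a zero-mean Gaussian with a chosen variance. The function name, the layer sizes, and the variance value are illustrative assumptions, not the specific initialization schemes evaluated in the publication.

```python
import numpy as np

def init_weights(fan_in, fan_out, distribution="uniform", variance=0.1, rng=None):
    """Draw an initial weight matrix with a given variance (or range).

    Illustrative sketch only: the variance value and the choice of distribution
    are exactly the parameters whose optimal settings the study investigates.
    """
    rng = rng or np.random.default_rng()
    if distribution == "uniform":
        # A zero-mean uniform distribution on [-r, r] has variance r^2 / 3,
        # so the half-range r is derived from the requested variance.
        r = np.sqrt(3.0 * variance)
        return rng.uniform(-r, r, size=(fan_in, fan_out))
    elif distribution == "normal":
        return rng.normal(0.0, np.sqrt(variance), size=(fan_in, fan_out))
    raise ValueError(f"unknown distribution: {distribution}")

# Example: initialize a 10-to-5 layer with a small initial weight variance.
W = init_weights(10, 5, distribution="uniform", variance=0.05)
```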
Abstract. In order to take advantage of the massive parallelism offered by artificial neural networks, hardware implementations are essential. However, most standard neural network models are not very suitable for implementation in hardware, and adaptations are needed. In this section an overview is given of the various issues that are encountered when mapping an ideal neural network model onto a compact and reliable neural network hardware implementation, like quantization, handling non-uniformities and non-ideal responses, and restraining computational complexity. Furthermore, a broad range of hardware-friendly learning rules is presented, which allow for simpler and more reliable hardware implementations. The relevance of these neural network adaptations to hardware is illustrated by their application in existing hardware implementations.
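To make the quantization issue mentioned above concrete, the following sketch rounds floating-point weights to a signed fixed-point grid with a limited number of bits. The bit width, clipping range, and rounding scheme are illustrative assumptions and do not correspond to any particular hardware design discussed in this overview.

```python
import numpy as np

def quantize_weights(weights, n_bits=8, w_max=1.0):
    """Map floating-point weights to a signed fixed-point grid.

    Illustrative sketch: n_bits and the clipping range w_max are assumptions;
    real hardware designs differ in bit width, rounding, and overflow handling.
    """
    levels = 2 ** (n_bits - 1) - 1           # number of positive quantization levels
    step = w_max / levels                    # size of one quantization step
    clipped = np.clip(weights, -w_max, w_max)
    return np.round(clipped / step) * step   # snap each weight to the nearest level

# Example: quantize a small weight matrix to 8-bit fixed point.
w = np.array([[0.37, -0.82], [1.4, -0.05]])
print(quantize_weights(w, n_bits=8))
```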
The backpropagation algorithm is widely used for training multilayer neural networks. In this publication the gain of its activation function(s) is investigated. Specifically, it is proven that changing the gain of the activation function is equivalent to changing the learning rate and the weights. This simplifies the backpropagation learning rule by eliminating one of its parameters. The theorem can be extended to hold for some well-known variations on the backpropagation algorithm, such as using a momentum term, flat spot elimination, or adaptive gain. Furthermore, it is successfully applied to compensate for the non-standard gain of optical sigmoids for optical neural networks.
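A hedged sketch of the equivalence stated above, with notation (gain gamma, learning rate eta, weights w) assumed rather than taken from the publication: the gain of a sigmoid can be absorbed into scaled weights, and matching the resulting weight trajectories under gradient descent then rescales the learning rate. This follows the commonly cited form of the result and is meant as an illustration of the theorem's statement, not its exact formulation here.

```latex
% Let f_\gamma(x) = f(\gamma x) denote a sigmoid with gain \gamma.
% The gain can be absorbed into scaled weights w_i' = \gamma w_i:
\[
  f_\gamma\!\Big(\sum_i w_i x_i\Big) \;=\; f\!\Big(\sum_i (\gamma w_i)\, x_i\Big).
\]
% Requiring both networks to follow the same trajectory under gradient
% descent, w_i'(t) = \gamma\, w_i(t), forces the learning rate to scale as
\[
  \eta' \;=\; \gamma^{2}\,\eta .
\]
```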