Abstract: An optimum weight initialization that strongly improves the performance of the back-propagation (BP) algorithm is suggested. By statistical analysis, the scale factor R (which is proportional to the maximum magnitude of the weights) is obtained as a function of the paralyzed neuron percentage (PNP). By computer simulation, convergence speed has also been related to PNP. An optimum range for R is shown to exist that minimizes the time needed to reach the minimum of the cost function. …
“…They concluded that the best initial weight variance is determined by the dataset, but differences for small deviations are not significant, and weights in the range [−0.77, 0.77] seem to give the best mean performance. Fernández-Redondo & Hernández-Espinosa (2001) presented an extensive experimental comparison of seven weight initialization methods: those reported by Kim & Ra (1991); Li et al. (1993); Palubinskas (1994); Shimodaira (1994); Yoon et al. (1995); Drago & Ridella (1992). The researchers claim that the methods of Palubinskas (1994) and Shimodaira (1994) gave the best results of all methods tested.…”
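The range reported above can be illustrated with a minimal sketch of interval-based random initialization. The function name, layer sizes, and the choice of NumPy are assumptions for illustration; only the bound 0.77 comes from the passage, and it is a dataset-dependent heuristic rather than a general rule.

```python
import numpy as np

def init_uniform(fan_in, fan_out, bound=0.77, seed=0):
    """Draw a (fan_out x fan_in) weight matrix uniformly from [-bound, bound].

    bound=0.77 follows the range quoted above as giving the best mean
    performance in those experiments; it is not a universal prescription.
    """
    rng = np.random.default_rng(seed)
    return rng.uniform(-bound, bound, size=(fan_out, fan_in))

# hypothetical layer: 4 inputs feeding 3 hidden neurons
W = init_uniform(4, 3)
```

Narrower or wider intervals shift the initial pre-activations toward the linear or saturated regions of the activation function, which is what the comparisons above measure.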
Section: Random Selection Of Initial Weights
Mentioning confidence: 96%
“…Drago & Ridella (1992) proposed a method aiming to avoid flat regions in the error surface in an early stage of training. Their method is called statistically controlled activation weight initialization (SCAWI).…”
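The idea behind SCAWI, controlling the weight magnitude so that few neurons start out saturated ("paralyzed"), can be sketched with a Monte Carlo estimate. This is an illustrative estimate only, not the closed-form SCAWI analysis; the input range, hidden-layer width, and saturation threshold are assumptions.

```python
import numpy as np

def paralyzed_fraction(scale, n_inputs, n_samples=2000, thresh=0.9, seed=0):
    """Estimate the fraction of tanh hidden activations that saturate
    (|a| > thresh) when weights are uniform in [-scale, scale].

    Monte Carlo illustration of the paralyzed-neuron idea; inputs are
    assumed uniform in [-1, 1] and the hidden layer has 64 units.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, size=(n_samples, n_inputs))   # assumed input range
    w = rng.uniform(-scale, scale, size=(n_inputs, 64))  # hypothetical hidden layer
    a = np.tanh(x @ w)
    return float(np.mean(np.abs(a) > thresh))

small = paralyzed_fraction(0.1, 8)  # modest weights: few saturated units
large = paralyzed_fraction(5.0, 8)  # large weights: mostly saturated units
```

Keeping this fraction low at initialization is what lets training avoid the flat regions of the error surface mentioned above.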
Section: Random Selection Of Initial Weights
Determining good initial conditions for an algorithm used to train a neural network can be treated as a parameter estimation problem dealing with uncertainty about the initial weights. Interval Analysis approaches model uncertainty in parameter estimation problems using intervals and formulating tolerance problems. Solving a tolerance problem means defining lower and upper bounds of the intervals so that the system functionality is guaranteed within predefined limits. The aim of this paper is to show how the problem of determining the initial weight intervals of a neural network can be defined as solving a linear interval tolerance problem. The proposed Linear Interval Tolerance Approach copes with uncertainty about the initial weights without any prior knowledge or specific assumptions on the input data, as required by approaches such as fuzzy sets or rough sets. The proposed method is tested on a number of well-known benchmarks for neural networks trained with the back-propagation family of algorithms. Its efficiency is evaluated with regard to standard performance measures, and the results obtained are compared against those of a number of well-established initialization methods. These results provide credible evidence that the proposed method outperforms classical weight initialization methods.
“…Efficient weight initialization is one of the most important factors for fast convergence and generalization, and many authors have proposed various weight initialization methods. The simplest and most widely used approach is random initialization under an assumed probability distribution, and several researchers have proposed modified methods to determine the best initialization interval [1][2]. Another initialization approach is to incorporate known prior knowledge into the weight initialization [3].…”
Section: Introduction
Mentioning confidence: 99%
“…If we assume that there are K neurons in the hidden layer, the weight matrices W1 and W2 for the two-pattern-class neural network can be represented by a single concatenated weight vector W of dimension L = (M+1)K + 2(K+1). Then we may view the weight vector W as a weight point in an L-dimensional weight space that is defined by all the weights in the neural network.…”
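The dimension count in the quoted passage can be checked with a short sketch. The sizes M and K are hypothetical; the bias is folded into each weight matrix as an extra column, which is the convention that makes L = (M+1)K + 2(K+1) come out as stated for a network with M inputs, K hidden neurons, and 2 outputs.

```python
import numpy as np

# Hypothetical sizes: M inputs, K hidden neurons, 2 outputs (two pattern classes).
M, K = 4, 3
W1 = np.zeros((K, M + 1))   # hidden-layer weights, bias folded in as a column
W2 = np.zeros((2, K + 1))   # output-layer weights, bias folded in as a column

# Concatenating all weights gives one point in an L-dimensional weight space.
W = np.concatenate([W1.ravel(), W2.ravel()])
L = (M + 1) * K + 2 * (K + 1)
```

Training then amounts to moving this single point through the L-dimensional space, which is the view the paper uses to analyze where solution weight points cluster.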
Abstract. In this paper, we investigate and analyze the weight distribution of feedforward two-layer neural networks in order to understand and improve the time-consuming training process of neural networks. Generally, it takes a long time to train neural networks. However, when a new problem is presented, neural networks have to be trained again without any benefit from previous training. In order to address this problem, we view training process as finding a solution weight point in a weight space and analyze the distribution of solution weight points in the weight space. Then, we propose a weight initialization method that uses the information on the distribution of the solution weight points. Experimental results show that the proposed weight initialization method provides a better performance than the conventional method that uses a random generator in terms of convergence speed.
SUMMARY: This paper proposes a new neural network architecture for nonlinear system modeling. Traditional neural network modeling methods have the following problems: (1) difficulty in analyzing the internal representation, namely the obtained values of the coupling weights; (2) no reproducibility, due to the random scheme for weight initialization; (3) insufficient generalization ability in regions of the input space where no training sample exists. To overcome these deficiencies, the proposed method takes the following approaches. The first is the design of a sigmoid function with a localized derivative. The second is a deterministic scheme for weight initialization. The third is an updating rule for the weight parameters. Simulations were conducted on several nonlinear systems with two inputs and one output. The results showed small initial error, small modeling error, smooth convergence, and improved analyzability of the internal representation.