How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be classified into five major groups. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an on-line fashion. We have also developed an on-line version of the proposed algorithm that is based on updating the error gradient approximation in a recursive manner.
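As a small illustration of the claim that different training approaches obtain the same error gradient by different computational routes, the sketch below compares two familiar schemes on a toy problem: forward propagation of sensitivities (in the spirit of real-time recurrent learning) and backward propagation of the error through time. It is not the derivation or the new algorithm of the paper. The scalar network `h_t = tanh(w*h_{t-1} + u*x_t)`, the squared-error loss, and all function names are assumptions chosen only for this example.

```python
import math


def run_forward(w, u, xs, h0=0.0):
    """Return the states [h_0, h_1, ..., h_T] of the toy scalar recurrent network."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, ds):
    """Squared error E = 0.5 * sum_t (h_t - d_t)^2 over t = 1..T."""
    hs = run_forward(w, u, xs)
    return 0.5 * sum((h - d) ** 2 for h, d in zip(hs[1:], ds))


def grad_forward_mode(w, u, xs, ds):
    """dE/dw by carrying the sensitivity p_t = dh_t/dw forward in time."""
    hs = run_forward(w, u, xs)
    p = 0.0      # dh_0/dw = 0 because h_0 is a fixed initial state
    grad = 0.0
    for t in range(1, len(hs)):
        p = (1.0 - hs[t] ** 2) * (hs[t - 1] + w * p)
        grad += (hs[t] - ds[t - 1]) * p
    return grad


def grad_backward_mode(w, u, xs, ds):
    """dE/dw by carrying the total derivative g_t = dE/dh_t backward in time."""
    hs = run_forward(w, u, xs)
    last = len(hs) - 1
    g_next = 0.0  # holds dE/dh_{t+1} from the previous backward step
    grad = 0.0
    for t in range(last, 0, -1):
        g = hs[t] - ds[t - 1]
        if t < last:
            # Error reaching h_t through its influence on h_{t+1}.
            g += g_next * (1.0 - hs[t + 1] ** 2) * w
        grad += g * (1.0 - hs[t] ** 2) * hs[t - 1]
        g_next = g
    return grad
```

Both functions return the same number up to rounding. They differ only in the order in which the chain-rule products are accumulated, which is what drives their different computational and storage costs.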
A nonlinear dynamic model is developed for a process system, namely a heat exchanger, using the recurrent multilayer perceptron network as the underlying model structure. The recurrent multilayer perceptron is a dynamic neural network, which appears effective in the input-output modeling of complex process systems. Dynamic gradient descent learning is used to train the recurrent multilayer perceptron, resulting in an order of magnitude improvement in convergence speed over a static learning algorithm used to train the same network. In developing the empirical process model, the effects of actuator, process, and sensor noise on the training and testing sets are investigated. Learning and prediction both appear very effective, despite the presence of training and testing set noise, respectively. The recurrent multilayer perceptron appears to learn the deterministic part of a stochastic training set, and it predicts approximately a moving average response of various testing sets. Extensive model validation studies with signals that are encountered in the operation of the modeled process system, that is, steps and ramps, indicate that the empirical model can substantially generalize operational transients, including accurate prediction of instabilities not in the training set. However, the accuracy of the model beyond these operational transients has not been investigated. Furthermore, online learning is necessary during some transients and for tracking slowly varying process dynamics. Neural-network-based empirical models in some cases appear to provide a serious alternative to first-principles models.
Predicting traffic generated by multimedia sources is needed for effective dynamic bandwidth allocation and for multimedia quality-of-service (QoS) control strategies implemented at the network edges. The time series representing frame or visual object plane (VOP) sizes of an MPEG-coded stream is extremely noisy, and it has very long-range time dependencies. This paper provides an approach for developing MPEG-coded real-time video traffic predictors for use in single-step (SS) and multistep (MS) prediction horizons. The designed SS predictor consists of one recurrent network for I-VOPs and two feedforward networks for P- and B-VOPs, respectively. These are used for single-frame-ahead prediction. A moving average of the frame or VOP size time series is generated from the individual frame sizes and used for both SS and MS prediction. The resulting MS predictor is based on recurrent networks, and it is used to perform two-step-ahead and four-step-ahead prediction, corresponding to multistep prediction horizons of 1 and 2 s, respectively. All of the predictors are designed using a segment of a single MPEG-4 video stream, and they are tested for accuracy on complete video streams with a variety of quantization levels, coded with both MPEG-1 and MPEG-4. Comparisons with SS prediction results of MPEG-1 coded video traces from the recent literature are presented. No similar results are available for prediction of MPEG-4 coded video traces and for MS prediction. These are considered unique contributions of this research.
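The moving-average step mentioned in the abstract can be sketched as follows. This is a minimal illustration only: the abstract does not state the window length or the exact averaging scheme, so the trailing window, the `window` parameter, and the function name `moving_average` are assumptions made for this example.

```python
def moving_average(sizes, window):
    """Trailing moving average of a frame-size (or VOP-size) series.

    Returns one value per position where a full window is available,
    so the result has len(sizes) - window + 1 entries.
    The window length is a hypothetical parameter, not taken from the paper.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    averages = []
    running = 0.0
    for i, size in enumerate(sizes):
        running += size
        if i >= window:
            # Drop the sample that just left the window.
            running -= sizes[i - window]
        if i >= window - 1:
            averages.append(running / window)
    return averages
```

For instance, `moving_average([10, 20, 30, 40], 2)` averages each pair of neighbouring sizes. The smoothed series is less noisy than the raw frame sizes, which is the property that makes it a useful predictor input.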