Learning Long-Term Dependencies in NARX Recurrent Neural Networks

Lin, Tsung-Nan; Horne, Bill G.; Tiňo, Peter; Giles, C. Lee

doi:10.1201/9781420049176.ch6

Cited by 105 publications

(143 citation statements)

References 12 publications

Supporting

Mentioning

139

Contrasting

Unclassified

“…To deal with long time lags between relevant events, several sequence processing methods were proposed, including Focused BP based on decay factors for activations of units in RNNs (Mozer, 1989, 1992), Time-Delay Neural Networks (TDNNs) (Lang et al, 1990) and their adaptive extension (Bodenhausen and Waibel, 1991), Nonlinear AutoRegressive with eXogenous inputs (NARX) RNNs (Lin et al, 1996), certain hierarchical RNNs (Hihi and Bengio, 1996) (compare Sec. 5.10, 1991), RL economies in RNNs with WTA units and local learning rules (Schmidhuber, 1989b), and other methods (e.g., Ring, 1993, 1994; Plate, 1993; de Vries and Principe, 1991; Sun et al, 1993a; Bengio et al, 1994).…”

Section: Ideas For Dealing With Long Time Lags and Deep CAPs (mentioning)

confidence: 99%

Deep learning in neural networks: An overview

2015

In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

LaTeX source: http://www.idsia.ch/~juergen/DeepLearning8Oct2014.tex
Complete BibTeX file (888 kB): http://www.idsia.ch/~juergen/deep.bib

Preface: This is the preprint of an invited Deep Learning (DL) overview. One of its goals is to assign credit to those who contributed to the present state of the art. I acknowledge the limitations of attempting to achieve this goal. The DL research community itself may be viewed as a continually evolving, deep network of scientists who have influenced each other in complex ways. Starting from recent DL results, I tried to trace back the origins of relevant ideas through the past half century and beyond, sometimes using "local search" to follow citations of citations backwards in time. Since not all DL publications properly acknowledge earlier relevant work, additional global search strategies were employed, aided by consulting numerous neural network experts. As a result, the present preprint mostly consists of references. Nevertheless, through an expert selection bias I may have missed important work. A related bias was surely introduced by my special familiarity with the work of my own DL research group in the past quarter-century. For these reasons, this work should be viewed as merely a snapshot of an ongoing credit assignment process. To help improve it, please do not hesitate to send corrections and suggestions to juergen@idsia.ch.

“…The above scenario comes true for all recurrent structures. However, one can postpone vanishing of gradient in NARX recurrent neural networks with increasing the number of delays in the output delay line of this architecture [6]. As it may be seen in Fig.…”

Section: NARX Recurrent Neural Network (mentioning)

confidence: 99%

“…if it is to store information for a long period of time in the presence of noise, then for a term with $u \ll t$, $\partial y(t)/\partial y(u) \to 0$ [5]. In this condition, the gradient decays exponentially [6], meaning that there is not any chance for the terms that are far from t to change the weights in such a way that allow the network's state to jump to a better basin of attraction. The above scenario comes true for all recurrent structures.…”

Section: NARX Recurrent Neural Network (mentioning)

confidence: 99%

“…To tackle the problem of vanishing gradient, a class of recurrent neural networks, called nonlinear autoregressive model with exogenous inputs (NARX) is proposed [6], which has various advantages over simple recurrent networks. Not only has the NARX model less sensitivity to long-term dependencies [6], but also it has a very good learning capability and generalization performance [7].…”

Section: Introduction (mentioning)

confidence: 99%
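
For readers unfamiliar with the architecture named in the statement above, the following is a minimal sketch of a NARX-style recurrence, in which the next output is computed from a tapped delay line of inputs and a tapped delay line of past outputs. It is an illustration only, not the implementation used in the cited papers: the function name `narx_forward`, the layer sizes, the tanh hidden layer, and the random untrained weights are all assumptions made for this example.

```python
import numpy as np

def narx_forward(u, n_in_taps=2, n_out_delays=3, n_hidden=4, seed=0):
    """Run a single-output NARX-style network with random, untrained weights.

    All sizes and the weight initialisation are hypothetical choices for
    illustration; they are not taken from the cited papers.
    """
    rng = np.random.default_rng(seed)
    n_taps = n_in_taps + n_out_delays
    W1 = rng.normal(scale=0.5, size=(n_hidden, n_taps))  # hidden-layer weights
    b1 = np.zeros(n_hidden)                              # hidden-layer biases
    w2 = rng.normal(scale=0.5, size=n_hidden)            # linear output weights

    u = np.asarray(u, dtype=float)
    y = np.zeros(len(u))
    for t in range(len(u)):
        # Input delay line: u(t), u(t-1), ..., zero-padded before the sequence starts.
        u_taps = [u[t - k] if t - k >= 0 else 0.0 for k in range(n_in_taps)]
        # Output delay line: y(t-1), ..., y(t-n_out_delays), also zero-padded.
        y_taps = [y[t - k] if t - k >= 0 else 0.0
                  for k in range(1, n_out_delays + 1)]
        x = np.array(u_taps + y_taps)
        # One hidden tanh layer followed by a linear readout.
        y[t] = w2 @ np.tanh(W1 @ x + b1)
    return y

if __name__ == "__main__":
    outputs = narx_forward(np.sin(np.arange(10)))
    print(outputs)
```

Increasing `n_out_delays` lengthens the output delay line, which is the quantity the citing statements refer to when they discuss postponing the vanishing of the gradient.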

Forecasting the Unknown Dynamics in NN3 Database Using a Nonlinear Autoregressive Recurrent Neural Network

Safavieh, Andalib, Andalib

2007

2007 International Joint Conference on Neural Networks

Abstract: In this paper, a nonlinear autoregressive (NAR) recurrent neural network is used to predict the next 18 data samples of each time series in a set of 11 unknown dynamics in the NN3 Database. The models are built on the reconstructed state spaces of the data, and no other domain knowledge is available. Here, we clarify that the employed method is in part similar to a superior subclass of recurrent neural networks, namely the nonlinear autoregressive model with exogenous inputs (NARX). Using the extensive available research on NARX networks, we briefly explain that our model is preferred to both non-recursive predictors and other recurrent predictors because of its intrinsic ability to learn long-term dependencies in time series. As the desired values of the predicted time series are not yet available, no analysis has been performed on the presented results.

Hydropower Optimization Using Artificial Neural Network Surrogate Models of a High‐Fidelity Hydrodynamics and Water Quality Model

Shaw, Sawyer, LeBoeuf, et al.

2017

Water Resources Research

Hydropower operations optimization subject to environmental constraints is limited by challenges associated with dimensionality and spatial and temporal resolution. The need for high‐fidelity hydrodynamic and water quality models within optimization schemes is driven by improved computational capabilities, increased requirements to meet specific points of compliance with greater resolution, and the need to optimize operations of not just single reservoirs but systems of reservoirs. This study describes an important advancement for computing hourly power generation schemes for a hydropower reservoir using high‐fidelity models, surrogate modeling techniques, and optimization methods. The predictive power of the high‐fidelity hydrodynamic and water quality model CE‐QUAL‐W2 is successfully emulated by an artificial neural network, then integrated into a genetic algorithm optimization approach to maximize hydropower generation subject to constraints on dam operations and water quality. This methodology is applied to a multipurpose reservoir near Nashville, Tennessee, USA. The model successfully reproduced high‐fidelity reservoir information while enabling 6.8% and 6.6% increases in hydropower production value relative to actual operations for dissolved oxygen (DO) limits of 5 and 6 mg/L, respectively, while witnessing an expected decrease in power generation at more restrictive DO constraints. Exploration of simultaneous temperature and DO constraints revealed capability to address multiple water quality constraints at specified locations. The reduced computational requirements of the new modeling approach demonstrated an ability to provide decision support for reservoir operations scheduling while maintaining high‐fidelity hydrodynamic and water quality information as part of the optimization decision support routines.
