2020
DOI: 10.1109/tnnls.2019.2953622

Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations

Abstract: Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications, including language modeling and speech processing. However, training these models often relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of backpropagation itself does not permit the use of non-differentiable activation functions and is inh…
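The "unfolding" the abstract refers to can be made concrete with a short sketch. The snippet below is a minimal, illustrative NumPy implementation of backpropagation through time for a vanilla tanh RNN under a toy loss; the names (`bptt_grads`, `W_x`, `W_h`) and the loss choice are assumptions for illustration, not anything taken from the paper.

```python
import numpy as np

def bptt_grads(W_x, W_h, xs, h0):
    """Minimal BPTT sketch for a vanilla tanh RNN (illustrative only).

    Forward: unfold the recurrence h_t = tanh(W_x @ x_t + W_h @ h_{t-1})
    over all T time steps, then walk backwards through the same unrolled
    graph to assign credit for a toy loss L = 0.5 * sum_t ||h_t||^2.
    """
    # ---- forward pass: unfold the network over T time steps ----
    hs = [h0]
    for x_t in xs:
        hs.append(np.tanh(W_x @ x_t + W_h @ hs[-1]))

    # ---- backward pass: credit assignment through every step ----
    dW_x = np.zeros_like(W_x)
    dW_h = np.zeros_like(W_h)
    dh_next = np.zeros_like(h0)          # gradient flowing back from step t+1
    for t in reversed(range(len(xs))):
        h_t, h_prev = hs[t + 1], hs[t]
        dh = h_t + dh_next               # dL/dh_t for the toy loss
        dpre = dh * (1.0 - h_t ** 2)     # backprop through tanh
        dW_x += np.outer(dpre, xs[t])
        dW_h += np.outer(dpre, h_prev)
        dh_next = W_h.T @ dpre           # pass credit to the previous step
    return dW_x, dW_h
```

Note how every weight update requires retracing the entire unrolled sequence, which is exactly the cost and non-locality the abstract highlights.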

Cited by 44 publications (71 citation statements)
References 66 publications
“…Neural generative coding (NGC) is a recently developed framework [35] that generalizes classical ideas in predictive processing [40,39] to the construction of scalable neural models that model and predict both static and temporal patterns [33,35]. An NGC model is composed of L layers of stateful neurons, where the activity values of each layer ℓ = {0, 1,…”
Section: The Neural Generative Coding Circuit
Mentioning confidence: 99%
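The layered, stateful structure described in the quote above can be sketched in a few lines. The settling step below is an illustrative predictive-processing circuit in the spirit of NGC: the linear top-down predictions, the update rule, and all names (`ngc_step`, `W`, `E`, `beta`) are assumptions for illustration, not the paper's exact equations.

```python
import numpy as np

def ngc_step(z, W, E, beta=0.1):
    """One settling step of a toy L+1-layer circuit of stateful neurons.

    z : list of activity vectors z[0..L]   (z[0] is clamped to the data)
    W : prediction matrices; W[l] maps z[l+1] to a prediction of z[l]
    E : error-feedback matrices; E[l] carries layer-l errors up to layer l+1
    """
    # error neurons: mismatch between each layer's activity and the
    # top-down prediction generated by the layer above it
    errors = [z[l] - W[l] @ z[l + 1] for l in range(len(W))]

    # state update: each hidden layer is nudged to reduce its own error
    # while better explaining the error of the layer below it
    for l in range(1, len(z)):
        own_error = -errors[l] if l < len(W) else 0.0
        z[l] = z[l] + beta * (own_error + E[l - 1] @ errors[l - 1])
    return z, errors
```

In practice such a circuit would iterate this step several times per input before applying local, Hebbian-like synaptic updates driven by the same error neurons.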
“…The online objective that an NGC model attempts to minimize is known as total discrepancy [44], from which the error neuron, state update expressions, and local synaptic adjustments may be derived [41,35]. The total discrepancy objective, which could also be interpreted as a form of free energy [45] specialized for the case of stateful neural models that utilize arbitrary forward and error synaptic wiring pathways [41], can be expressed in many forms including the linear combination of local density functions [35] or the summation of local distance functions [33]. For this study, the form of total discrepancy we used to derive the expressions above is the linear combination of distance functions:…”
Section: The Neural Generative Coding Circuit
Mentioning confidence: 99%
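One common way to write the total-discrepancy objective described above is as a linear combination of local distance terms, one per layer. The form below is an illustrative sketch: the layer weights \(\alpha_\ell\), the squared Euclidean distance, and the prediction notation \(\bar{\mathbf{z}}^{\ell}\) are assumptions, not necessarily the exact choices made in the cited work.

```latex
% Total discrepancy as a linear combination of local distance functions
% (illustrative form; weights and distance choice are assumptions).
\[
\mathcal{L}(\Theta)
  = \sum_{\ell=0}^{L-1} \alpha_\ell
    \left\| \mathbf{z}^{\ell} - \bar{\mathbf{z}}^{\ell} \right\|_2^2,
\qquad
\bar{\mathbf{z}}^{\ell}
  = g^{\ell}\!\left( \mathbf{W}^{\ell+1}\, \phi^{\ell+1}\!\left(\mathbf{z}^{\ell+1}\right) \right)
\]
```

Each term penalizes the mismatch between a layer's activity and the top-down prediction of it, which is precisely what the error neurons measure and what the local synaptic adjustments descend on.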
“…NotMNIST 5 is a more difficult variation of MNIST created by replacing the digits with characters of varying fonts/glyphs (letters A-J). The preprocessing and training, validation, and testing splits were created to match the setup of MNIST with the exception that there is more data in NotMNIST, i.e., 100,000 data points are in the training split we used for our experiments (adapted from the smaller variant in [28]).…”
Section: MNIST
Mentioning confidence: 99%
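As a concrete illustration of the split described above, the snippet below carves a 100,000-image training split out of a NotMNIST array while holding out MNIST-sized validation and test sets. The function name, the 10,000-example hold-out sizes, and the flatten-and-rescale preprocessing are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def make_notmnist_splits(images, labels, seed=0):
    """Illustrative train/valid/test split of NotMNIST.

    Assumes `images` is an (N, 28, 28) uint8 array with N >= 120,000 and
    mirrors an MNIST-like setup: 100,000 training points plus 10,000-point
    validation and test splits (these sizes are assumptions).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))

    # flatten to 784-dim vectors and rescale to [0, 1], as is typical for MNIST
    x = images[order].reshape(len(images), -1).astype(np.float32) / 255.0
    y = labels[order]

    x_train, y_train = x[:100_000], y[:100_000]
    x_valid, y_valid = x[100_000:110_000], y[100_000:110_000]
    x_test,  y_test  = x[110_000:120_000], y[110_000:120_000]
    return (x_train, y_train), (x_valid, y_valid), (x_test, y_test)
```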
“…For example, the ability of neural networks to predict wind speed [6] or of neuro-FS to predict the magnitude of the next large earthquakes could be considered [7]. Many types of SC models have been introduced, improved, or evaluated in the literature [8-12] to resolve complex problems in different fields, such as nonlinear delayed systems [13], transcoding simultaneously acquired MRI data [14], power system restoration [15], eye surgery [16], broiler output energies [17], intelligent human action recognition [18], the behavior and environmental impacts of waste glass geopolymers [19], and also concrete elements [20-23]…”
Section: Introduction
Mentioning confidence: 99%