2018
DOI: 10.3389/fnins.2018.00608

Deep Supervised Learning Using Local Errors

Abstract: Error backpropagation is a highly effective mechanism for learning high-quality hierarchical features in deep networks. Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers. Learning using delayed and non-local errors makes it hard to reconcile backpropagation with the learning mechanisms observed in biological neural networks as it requires the neurons to maintain a memory of the input long enough until the higher-layer errors arrive.…

Citations: cited by 96 publications (92 citation statements)
References: 60 publications
“…Backpropagation is biologically unrealistic for several reasons such as the need to interleave forward and backward passes, and the use of symmetric weights in the forward and backward passes. More biologically-plausible models have been proposed to address these issues that use contrastive learning in energy-based models Xie & Seung (2003); Bengio & Fischer (2015); Scellier & Bengio (2017), or that relax the symmetry requirement by using random weights in the backward pass Lillicrap et al (2016); Baldi et al (2016); Nøkland (2016); Mostafa et al (2017). These methods, however, have been applied in supervised learning settings and their performance and applicability to learning in stochastic networks is unclear.…”
Section: Discussion (mentioning)
confidence: 99%
“…Further work is needed to develop a more biologically-motivated learning method, in the spirit of the learning method in ref. Mostafa et al (2017), that learns online and changes synaptic weights based only on information in the pre- and post-synaptic neurons.…”
Section: Introduction (mentioning)
confidence: 99%
“…Consistent with our results, high classification performance was reported using a GoogLeNet model pre-trained on ImageNet as a feature extractor (Zhu et al, 2019). Deep neural networks (DNNs) are trained using the optimized SGD algorithm, which calculates an expected error gradient for the current model state from the training datasets and corrects the weights of a node in the network each time by backpropagation, where the amount of weight updated during training is a configurable hyperparameter called the LR (Mostafa et al, 2018; Zhao et al, 2019). The performance of SGD depended on how the LRs, which control the rate or speed at the end of each training batch, are tuned over time (Zhao et al, 2019).…”
Section: Contributions Of Parameters For Prediction Performance In Th (mentioning)
confidence: 94%
“…Gradient BP can solve this, but is not compatible with a physical implementation of the neural network [6]. Several approximations have emerged recently to solve this, such as feedback alignment [7]-[9], and local losses defined for each layer [10]-[12]. For classification, local losses can be local classifiers (using output labels) [10], and supervised clustering, which perform on par and sometimes better than BP in classical ML benchmark tasks [12].…”
Section: B Local Losses and Local Errors (mentioning)
confidence: 99%
“…Several approximations have emerged recently to solve this, such as feedback alignment [7]-[9], and local losses defined for each layer [10]-[12]. For classification, local losses can be local classifiers (using output labels) [10], and supervised clustering, which perform on par and sometimes better than BP in classical ML benchmark tasks [12]. For all experiments used in this work, we use a layerwise local classifier using a mean-squared error loss defined as…”
Section: B Local Losses and Local Errors (mentioning)
confidence: 99%
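
Note on the last two statements: the loss definition in the quoted passage is cut off, so the sketch below does not reproduce it. It is only a generic illustration of what a layerwise local classifier trained with a mean-squared error can look like. Everything in it is an assumption made for illustration: the function name `train_local_mse`, the ReLU hidden layers, the fixed random readout matrices, the layer sizes, and the learning rate. Each layer is updated from its own local error only, and no error signal is propagated back to earlier layers.

```python
import numpy as np


def train_local_mse(x, y_onehot, hidden=(16, 16), lr=0.05, epochs=200, seed=0):
    """Train each hidden layer on its own local MSE classification loss.

    Hypothetical sketch: every layer gets a fixed random readout that maps
    its activations to class scores, and only that layer's weights are
    updated from the resulting local error.
    """
    rng = np.random.default_rng(seed)
    sizes = (x.shape[1],) + tuple(hidden)
    n_classes = y_onehot.shape[1]
    n_samples = x.shape[0]

    # Trainable feed-forward weights, one matrix per hidden layer.
    weights = [rng.normal(0.0, np.sqrt(2.0 / m), (m, n))
               for m, n in zip(sizes[:-1], sizes[1:])]
    # Fixed random local classifiers (assumption: never trained).
    readouts = [rng.normal(0.0, np.sqrt(1.0 / n), (n, n_classes))
                for n in sizes[1:]]

    history = []
    for _ in range(epochs):
        h = x
        losses = []
        for w, r in zip(weights, readouts):
            pre = h @ w
            act = np.maximum(pre, 0.0)           # ReLU activation
            err = act @ r - y_onehot             # local error at this layer
            losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
            # Gradient of the local loss w.r.t. this layer's weights only;
            # the layer input h is treated as a constant.
            grad_pre = (err @ r.T) * (pre > 0)
            w -= lr * (h.T @ grad_pre) / n_samples
            h = act                              # forward only, no backward pass
        history.append(losses)
    return weights, readouts, np.array(history)


# Toy usage: two Gaussian blobs, two classes.
demo_rng = np.random.default_rng(1)
demo_x = np.vstack([demo_rng.normal(-1.0, 0.5, (50, 2)),
                    demo_rng.normal(1.0, 0.5, (50, 2))])
demo_y = np.zeros((100, 2))
demo_y[:50, 0] = 1.0
demo_y[50:, 1] = 1.0
_, _, demo_history = train_local_mse(demo_x, demo_y)
print("first-layer local loss, first vs last epoch:",
      demo_history[0, 0], demo_history[-1, 0])
```

Because each weight update depends only on the layer's own input, activation, and local error, a layer does not have to wait for errors from higher layers, which is the property the abstract above contrasts with backpropagation.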