Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning

Ostapenko, Oleksiy; Puscas, Mihai Marian; Klein, Tassilo; Jähnichen, Patrick; Nabi, Moin

doi:10.1109/cvpr.2019.01158

Cited by 235 publications

(178 citation statements)

References 24 publications

Supporting

Mentioning

178

Contrasting

Order By: Relevance

“…DGM was proposed in [ 96 ], which relies on conditional generative adversarial networks. It trained a sparse binary mask for each layer of the generator.…”

Section: Methods Descriptionmentioning

confidence: 99%

See 2 more Smart Citations

An Appraisal of Incremental Learning Methods

Luo

Yin

Bai

et al. 2020

Entropy

View full text Add to dashboard Cite

As a special case of machine learning, incremental learning can acquire useful knowledge from incoming data continuously while it does not need to access the original data. It is expected to have the ability of memorization and it is regarded as one of the ultimate goals of artificial intelligence technology. However, incremental learning remains a long term challenge. Modern deep neural network models achieve outstanding performance on stationary data distributions with batch training. This restriction leads to catastrophic forgetting for incremental learning scenarios since the distribution of incoming data is unknown and has a highly different probability from the old data. Therefore, a model must be both plastic to acquire new knowledge and stable to consolidate existing knowledge. This review aims to draw a systematic review of the state of the art of incremental learning methods. Published reports are selected from Web of Science, IEEEXplore, and DBLP databases up to May 2020. Each paper is reviewed according to the types: architectural strategy, regularization strategy and rehearsal and pseudo-rehearsal strategy. We compare and discuss different methods. Moreover, the development trend and research focus are given. It is concluded that incremental learning is still a hot research area and will be for a long period. More attention should be paid to the exploration of both biological systems and computational models.

show abstract

“…DGM was proposed in [ 96 ], which relies on conditional generative adversarial networks. It trained a sparse binary mask for each layer of the generator.…”

Section: Methods Descriptionmentioning

confidence: 99%

“…All images are gray level with size of 32 × 32 pixels. MNIST is adopted in [ 15 , 22 , 33 , 47 , 52 , 54 , 75 , 92 , 96 ].…”

Section: Datasetsmentioning

confidence: 99%

See 1 more Smart Citation

An Appraisal of Incremental Learning Methods

Luo

Yin

Bai

et al. 2020

Entropy

View full text Add to dashboard Cite

show abstract

“…Expanding the model capacities during training has been mainly discussed in the lifelong learning area, which tries to fit a model to the data sequence without catastrophic forgetting [9,17,22,27]. One of the common approaches is assigning an expanded capacity for the incoming data while limiting the parameters' change to the proximity of the trained parameters by the previous data [6,21,27].…”

Section: Model Expansionmentioning

confidence: 99%

Fewer

Shin

Lee

Shin

et al. 2020

Proceedings of the 1st Workshop on Distributed Machine Learning

View full text Add to dashboard Cite

In federated learning, the local devices train the model with their local data, independently; and the server gathers the locally trained model to aggregate them into a shared global model. Therefore, federated learning is an approach to decouple the model training from directly assessing the local data. However, the requirement of periodic communications on model parameters results in a primary bottleneck for the efficiency of federated learning. This work proposes a novel federated learning algorithm, Federated Weight Recovery(FEWER), which enables a sparsely pruned model in the training phase. FEWER starts with the initial model training with an extremely sparse state, and FEWER gradually grows the model capacity until the model reaches a dense model at the end of the training. The level of sparsity becomes the leverage to either increasing the accuracy or decreasing the communication cost, and this sparsification can be beneficial to practitioners. Our experimental results show that FEWER achieves higher test accuracies with less communication costs for most of the test cases.

show abstract

“…According to recent works [28,2,34] aligning domain shift through batch normalization (BN) [23] layers, although in online condition we never have access to full target data, we are inspired to smoothly adjust model statistics through online data stream of target videos. Besides the statistics, works [20,37,41,40] on multi-domain or incremental learning inspire us to selectively tune a small subset of learned basic parameters while keeping all the other fixed. In our setting this is meant to ensure that visual appearance variations arising from scene changes will never influence the networks weights encoding the reliable knowledge.…”

Section: Introductionmentioning

confidence: 99%

Online Depth Learning Against Forgetting in Monocular Videos

Zhang

Lathuilière

Ricci

et al. 2020

2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

View full text Add to dashboard Cite

Online depth learning is the problem of consistently adapting a depth estimation model to handle a continuously changing environment. This problem is challenging due to the network easily overfits on the current environment and forgets its past experiences. To address such problem, this paper presents a novel Learning to Prevent Forgetting (LPF) method for online mono-depth adaptation to new target domains in unsupervised manner. Instead of updating the universal parameters, LPF learns adapter modules to efficiently adjust the feature representation and distribution without losing the pre-learned knowledge in online condition. Specifically, to adapt temporal-continuous depth patterns in videos, we introduce a novel meta-learning approach to learn adapter modules by combining online adaptation process into the learning objective. To further avoid overfitting, we propose a novel temporal-consistent regularization to harmonize the gradient descent procedure at each online learning step. Extensive evaluations on realworld datasets demonstrate that the proposed method, with very limited parameters, significantly improves the estimation quality.

show abstract

Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning

Cited by 235 publications

References 24 publications

An Appraisal of Incremental Learning Methods

An Appraisal of Incremental Learning Methods

Fewer

Online Depth Learning Against Forgetting in Monocular Videos

Contact Info

Product

Resources

About