Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting

Li, Xilai; Zhou, Yingbo; Wu, Tianfu; Socher, Richard; Xiong, Caiming

doi:10.48550/arxiv.1904.00310

Cited by 28 publications

(22 citation statements)

References 20 publications


“…A few works also look at continual learning from the perspectives of the loss landscape and dynamics of optimization [Mirzadeh et al, 2020, Mirzadeh et al, 2020b]. Modularity-based methods allocate different subsets of the parameters to each task [Rusu et al, 2016, Yoon et al, 2018, Jerfel et al, 2019, Li et al, 2019, Wortsman et al, 2020, Mirzadeh et al, 2020a].…”

Section: Related Work (mentioning)

confidence: 99%

Task-agnostic Continual Learning with Hybrid Probabilistic Models

Kirichenko, Farajtabar, Rao et al. 2021

Preprint


Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting, all leveraging the invertibility and exact likelihood which are uniquely enabled by the normalizing flow model. We use the generative capabilities of the flow to avoid catastrophic forgetting through generative replay and a novel functional regularization technique. For task identification, we use state-of-the-art anomaly detection techniques based on measuring the typicality of the model's statistics. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST. * Work partially done as an intern at DeepMind.


“…Microscopically, existing methods dynamically expand networks using thresholds on loss functions over new tasks and retrain the selected weights to prevent semantic drift [68]. Reinforced continual learning [65] employs a controller to define a strategy that expands the architecture of a given network while the learn-to-grow model [32] relies on neural architecture search [76] to define optimal architectures on new tasks. Other models [7], inspired by the process of adult neurogenesis in the hippocampus, combine architecture expansion with pseudo-rehearsal using auto-encoders.…”

Section: Related Work (mentioning)

confidence: 99%

FFNB: Forgetting-Free Neural Blocks for Deep Continual Visual Learning

Sahbi, Zhan 2021

Preprint


Deep neural networks (DNNs) have recently achieved great success in computer vision and several related fields. Despite such progress, current neural architectures still suffer from catastrophic interference (a.k.a. forgetting), which prevents DNNs from learning continually. While several state-of-the-art methods have been proposed to mitigate forgetting, these existing solutions are either highly rigid (as with regularization) or time- and memory-demanding (as with replay). An intermediate class of methods, based on dynamic networks, has been proposed in the literature and provides a reasonable balance between task memorization and computational footprint. In this paper, we devise a dynamic network architecture for continual learning based on a novel forgetting-free neural block (FFNB). Training FFNB features on new tasks is achieved using a novel procedure that constrains the underlying parameters in the null-space of the previous tasks, while training classifier parameters equates to Fisher discriminant analysis. The latter provides an effective incremental process which is also optimal from a Bayesian perspective. The trained features and classifiers are further enhanced using an incremental "end-to-end" fine-tuning. Extensive experiments, conducted on different challenging classification problems, show the high effectiveness of the proposed method.


“…Experiments show that Firefly efficiently learns accurate and resource-efficient networks in various settings. In particular, for continual learning, our method learns more accurate and smaller networks that can better prevent catastrophic forgetting, outperforming state-of-the-art methods such as Learn-to-Grow (Li et al, 2019) and Compact-Pick-Grow (Hung et al, 2019a).…”

Section: Introduction (mentioning)

confidence: 98%

“…In addition, dynamically growing neural network has also been proposed as a promising approach for preventing the challenging catastrophic forgetting problem in continual learning (Rusu et al, 2016; Yoon et al, 2017; Rosenfeld & Tsotsos, 2018; Li et al, 2019).…”

Section: Introduction (mentioning)

confidence: 99%

Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

Wu, Liu, Stone et al. 2021

Preprint


We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures. Our method works in a steepest descent fashion, which iteratively finds the best network within a functional neighborhood of the original network that includes a diverse set of candidate network structures. By using Taylor approximation, the optimal network structure in the neighborhood can be found with a greedy selection procedure. We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures that avoid catastrophic forgetting in continual learning. Empirically, firefly descent achieves promising results on both neural architecture search and continual learning. In particular, on a challenging continual image classification task, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
