2021
DOI: 10.1109/tpami.2021.3057446

A continual learning survey: Defying forgetting in classification tasks

Abstract: Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch.

Cited by 920 publications (673 citation statements)
References 81 publications (137 reference statements)

“…We now evaluate B-CL by comparing it with both non-continual learning and continual learning baselines. We follow the standard CL evaluation method in (Lange et al., 2019). We first present B-CL with a sequence of aspect sentiment classification (ASC) tasks for it to learn.…”
Section: Methods (mentioning; confidence: 99%)

“…Further, sequentially learning multiple tasks by finetuning a neural network results in significant loss of previously acquired knowledge. Literature on continual learning largely addresses coping with this catastrophic forgetting [5,25]. Nonetheless, recent works mainly focus on supervised data, leaving the richness of available unsupervised user data unused.…”
Section: Related Work (mentioning; confidence: 99%)

“…Nonetheless, recent works mainly focus on supervised data, leaving the richness of available unsupervised user data unused. Following [5], these methods can be subdivided into three main categories. First, parameter-isolation methods preserve task knowledge by obtaining task-specific masks [22,21,31], or dynamically extending the architecture [30].…”
Section: Related Work (mentioning; confidence: 99%)
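
The parameter-isolation idea mentioned in the excerpt above can be illustrated with a minimal sketch. The snippet below is a toy illustration rather than the method of any cited work: every weight of a shared layer is assigned to exactly one task via a binary mask, so a gradient step for one task cannot overwrite the parameters another task relies on. All names and sizes are hypothetical, and the masks are fixed at random here, whereas published mask-based methods learn or prune them.

```python
# Toy sketch of parameter isolation with task-specific binary masks.
# Illustration of the general idea only; masks are random, not learned.
import numpy as np

rng = np.random.default_rng(0)

in_dim, out_dim, n_tasks = 8, 4, 3                    # hypothetical sizes
W = rng.normal(scale=0.1, size=(in_dim, out_dim))     # shared weight matrix

# Give every weight to exactly one task (disjoint masks = full isolation).
owner = rng.integers(0, n_tasks, size=(in_dim, out_dim))
masks = [owner == t for t in range(n_tasks)]

def forward(x, task_id):
    """Forward pass that only uses the weights owned by `task_id`."""
    return x @ (W * masks[task_id])

def sgd_step(x, grad_out, task_id, lr=0.1):
    """Gradient update restricted to the current task's mask, so the
    parameters owned by other tasks are never modified."""
    global W
    W -= lr * (np.outer(x, grad_out) * masks[task_id])

x = rng.normal(size=in_dim)
out_task0_before = forward(x, task_id=0)
sgd_step(x, grad_out=np.ones(out_dim), task_id=1)     # "train" on task 1
out_task0_after = forward(x, task_id=0)

# Task 0's behaviour is unchanged: its weights were isolated from the update.
print(np.max(np.abs(out_task0_after - out_task0_before)))  # 0.0
```

The other parameter-isolation variant named in the excerpt, dynamically extending the architecture, would instead add new units or columns for each new task rather than masking existing ones.
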
“…For example, in a real-world application, customers may request deletion of their data, or the service itself may provide specific data retention policies, or the adopted model may be provided by a third party that did not release the training data (Chen and Moschitti, 2019). In these cases, new language support can be added in a Continual Learning (CL) setting (Lange et al., 2019), that is, fine-tuning the model only on the annotated material for the new language(s). However, this approach is vulnerable to Catastrophic Forgetting (CF) (McCloskey and Cohen, 1989) of previously learned languages, a well-documented concern discussed in Chen et al. (2018): when a model is incrementally fine-tuned on new data distributions, it risks forgetting how to treat instances of the previously learned ones.…”
Section: Introduction (mentioning; confidence: 99%)