2021
DOI: 10.48550/arxiv.2106.05390
Preprint

Optimizing Reusable Knowledge for Continual Learning via Metalearning

Abstract: When learning tasks over time, artificial neural networks suffer from a problem known as Catastrophic Forgetting (CF). This happens when the weights of a network are overwritten during the training of a new task, causing the network to forget old information. To address this issue, we propose MetA Reusable Knowledge, or MARK, a new method that fosters weight reusability instead of overwriting when learning a new task. Specifically, MARK keeps a set of shared weights among tasks. We envision these shared weights as a com…

Cited by 1 publication (1 citation statement)
References 18 publications
“…The neuromodulatory neural network is meta-trained to activate only a subset of parameters of the prediction learning network for each task in order to mitigate catastrophic forgetting. Hurtado et al. [167] proposed a method called Meta Reusable Knowledge (MARK), which has a single common knowledge base for all the learned tasks. For each new task, MARK uses meta-learning to update the common knowledge base and uses a trainable mask to extract the relevant parameters from the knowledge base for the task.…”
Section: Fixed Network Architectures
confidence: 99%
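The masking mechanism the citation statement describes can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `KnowledgeBase` class, the uniform mask initialization, and the single-matrix knowledge base are all assumptions made for the sketch; in MARK the knowledge base is updated via meta-learning and the masks are trained per task.

```python
import numpy as np

class KnowledgeBase:
    """Illustrative sketch: shared weights plus one mask per task.

    Hypothetical structure, not taken from the paper's code.
    """

    def __init__(self, n_features, seed=0):
        rng = np.random.default_rng(seed)
        # Weights shared across all tasks (the "common knowledge base").
        self.shared = rng.standard_normal((n_features, n_features))
        # One trainable mask per task; stored by task identifier.
        self.masks = {}

    def add_task(self, task_id):
        # Start from a uniform mask; in MARK this mask would be trained
        # so that it selects the task-relevant parameters.
        self.masks[task_id] = np.full(self.shared.shape[1], 0.5)

    def forward(self, x, task_id):
        # Extract task-relevant parameters by masking the shared weights,
        # then apply them to the input.
        masked = self.shared * self.masks[task_id]
        return x @ masked
```

The point of the design is that learning a new task trains a new mask (and refines the shared weights) rather than overwriting weights that earlier tasks depend on.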