Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-short.66

Continual Learning for Task-oriented Dialogue System with Iterative Network Pruning, Expanding and Masking

Abstract: The ability to learn consecutive tasks without forgetting how to perform previously trained tasks is essential for developing an online dialogue system. This paper proposes an effective continual learning method for task-oriented dialogue systems with iterative network pruning, expanding and masking (TPEM), which preserves performance on previously encountered tasks while accelerating learning progress on subsequent tasks. Specifically, TPEM (i) leverages network pruning to keep the knowledge for old tasks, (ii) …
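The abstract describes a three-step loop over tasks: prune to preserve old-task knowledge, expand to create free capacity, and mask to isolate each task. The sketch below only illustrates that loop and is not the authors' implementation; `prune_and_freeze`, `expand_layers` and `train_with_mask` are hypothetical helpers standing in for the paper's three components.

```python
# Illustrative outline of a prune-expand-mask training loop (not the TPEM code).
# The three helpers are hypothetical placeholders for the components named in
# the abstract: (i) pruning, (ii) expanding, (iii) task-specific masking.
def prune_expand_mask_training(model, tasks, prune_and_freeze, expand_layers, train_with_mask):
    task_masks = {}
    for t, task in enumerate(tasks, start=1):
        if t > 1:
            expand_layers(model)                # create free weights for the new task
        mask = train_with_mask(model, task)     # train task t on weights not frozen by old tasks
        prune_and_freeze(model)                 # keep the weights important for task t fixed
        task_masks[t] = mask                    # the mask selects task t's weights at inference
    return task_masks
```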

Cited by 11 publications (7 citation statements)
References 22 publications

Citation statements (ordered by relevance):
“…Previous setups in continual learning for task-oriented dialogue have focused on learning a set of domains in a sequential order, where each domain is only observed for a certain period of time [7]–[9], [46]. The goal then is to maximise performance on all seen domains.…”
Section: Related Work (A. Continual Learning Setup in Dialogue)
confidence: 99%
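The setup described in this excerpt (domains observed one after another, with the goal of maximising performance on all seen domains) corresponds to a simple train-then-evaluate loop. The following sketch only illustrates that protocol; `train_on` and `accuracy_on` are hypothetical helpers, not functions from any cited work.

```python
# Sequential-domain continual learning protocol, as described in the excerpt above.
# `train_on` and `accuracy_on` are hypothetical stand-ins for a concrete trainer/evaluator.
from typing import Callable, List

def continual_run(model, domains: List[str],
                  train_on: Callable, accuracy_on: Callable) -> List[float]:
    seen, avg_scores = [], []
    for domain in domains:
        train_on(model, domain)                         # each domain is observed only in its phase
        seen.append(domain)
        scores = [accuracy_on(model, d) for d in seen]  # evaluate on every domain seen so far
        avg_scores.append(sum(scores) / len(scores))    # average performance over seen domains
    return avg_scores
```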
“…Introducing task-specific parameters. Research on continual neural pruning assigns some model capacity to each task by iteratively pruning and retraining (sub-)networks that are specialized to each task (Mallya and Lazebnik, 2018; Geng et al., 2021; Dekhovich et al., 2023; Kang et al., 2022; Hung et al., 2019; Gurbuz and Dovrolis, 2022; Jung et al., 2020; Wang et al., 2022). However, while such methods are effective in overcoming forgetting, the evolution and learning behavior of pruned subnetworks provides little interpretability regarding the role of individual parts of the network in solving CL problems.…”
Section: Related Work
confidence: 99%
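The parameter-isolation idea referenced in this excerpt (assigning part of the network's capacity to each task by pruning and retraining) can be sketched as magnitude pruning with per-task ownership of weights. This is a hedged illustration in the spirit of PackNet-style methods, not the code of any cited paper; the layer type, keep ratio and gradient-masking hook are assumptions.

```python
# Illustration of per-task capacity assignment via magnitude pruning (PackNet-style sketch).
import torch
import torch.nn as nn

class TaskPartitionedLinear(nn.Linear):
    def __init__(self, in_features: int, out_features: int):
        super().__init__(in_features, out_features)
        # owner[i, j] = id of the task that reserved weight (i, j); 0 means still free
        self.register_buffer("owner", torch.zeros_like(self.weight, dtype=torch.long))

    def reserve_for_task(self, task_id: int, keep_ratio: float = 0.5) -> None:
        """After training task `task_id`, keep its largest free weights and release the rest."""
        free = self.owner == 0
        scores = self.weight.abs() * free                    # ignore weights owned by old tasks
        k = max(int(keep_ratio * free.sum().item()), 1)
        threshold = scores.flatten().topk(k).values.min()
        newly_owned = (scores >= threshold) & free
        self.owner[newly_owned] = task_id
        self.weight.data[free & ~newly_owned] = 0.0          # freed for future tasks

    def mask_gradients(self, current_task: int) -> None:
        """Call after loss.backward(): only free and current-task weights may be updated."""
        trainable = (self.owner == 0) | (self.owner == current_task)
        self.weight.grad *= trainable
```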
“…Previous architectural methods include dynamically expanding the network structure (Rusu et al., 2016), iterative network pruning and re-training, learning a parameter mask for each task individually, etc. For continual learning in dialogue systems, variants of general CL methods have been applied (Lee, 2017; Shen et al., 2019; Mi et al., 2020; Geng et al., 2021). AdapterCL is the most related to our work, which freezes the pre-trained model and learns an adapter (Houlsby et al., 2019) for each task independently.…”
Section: Continual Learning
confidence: 99%
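AdapterCL, as described in this excerpt, freezes the pre-trained model and trains a small adapter per task. The sketch below shows the general idea of a bottleneck adapter (Houlsby et al., 2019) kept per task behind a frozen backbone; the hidden size, bottleneck width and residual placement are illustrative assumptions, not AdapterCL's exact configuration.

```python
# Per-task bottleneck adapters over a frozen backbone (illustrative sketch).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(torch.relu(self.down(h)))    # residual bottleneck transformation

class AdapterPerTask(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_size: int = 768):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():            # shared parameters stay frozen
            p.requires_grad = False
        self.adapters = nn.ModuleDict()                 # one independently trained adapter per task
        self.hidden_size = hidden_size

    def add_task(self, task_id: str) -> None:
        self.adapters[task_id] = Adapter(self.hidden_size)

    def forward(self, x: torch.Tensor, task_id: str) -> torch.Tensor:
        return self.adapters[task_id](self.backbone(x))
```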
“…Simply storing a model version for each task to mitigate forgetting is prohibitive as the number of tasks grows, especially when the model size is large. To mitigate catastrophic forgetting with low computation and storage overhead, recent methods freeze the backbone model and propose to train a weight/feature mask (Geng et al., 2021) or an adapter for each task independently. However, the techniques above are still not efficient enough, and they largely ignore knowledge transfer among tasks.…”
Section: Introduction
confidence: 99%
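The "frozen backbone plus per-task weight/feature mask" approach mentioned in this excerpt can be illustrated with a learned binary mask applied over fixed weights. The sketch below uses a straight-through threshold on real-valued mask scores; it is a generic illustration of the technique rather than the implementation of any cited paper, and the initialisation and threshold are assumptions.

```python
# Learned binary weight mask over a frozen linear layer (illustrative sketch).
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, threshold: float = 0.0):
        super().__init__()
        # Shared weights are frozen; only the per-task mask scores are trained.
        self.weight = nn.Parameter(0.02 * torch.randn(out_features, in_features),
                                   requires_grad=False)
        self.bias = nn.Parameter(torch.zeros(out_features), requires_grad=False)
        self.mask_scores = nn.Parameter(0.01 * torch.ones(out_features, in_features))
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hard = (self.mask_scores > self.threshold).float()      # binary mask used in the forward pass
        # Straight-through estimator: gradients reach mask_scores as if the mask were continuous.
        mask = hard + self.mask_scores - self.mask_scores.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)
```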