2021
DOI: 10.48550/arxiv.2103.13678
Preprint

Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation

Abstract: Domain adaptation is widely used in practical applications of neural machine translation, where it aims to achieve good performance on both general-domain and in-domain data. However, existing methods for domain adaptation usually suffer from catastrophic forgetting, domain divergence, and model explosion. To address these three problems, we propose a "divide and conquer" method based on the importance of neurons or parameters in the translation model. In our method, we first prune the model and onl…

Cited by 1 publication (1 citation statement)
References 33 publications
“…While some work has been done to interpret representations in the speech models [38,31,32,33,34,39], no prior work has been carried to do a neuron-level analysis. Analyzing individual neurons facilitates a deeper understanding of the network [40,41,42,23,36,37,43] and entails many potential benefits such as manipulating system's output [44] while debasing the network w.r.t certain property (like gender or racial elements), model distillation and compression [45,46], domain adaptation [47], feature selection for downstream tasks [48] and guiding architectural search etc. To develop a better understanding of the learned representations, along with presence of bias and redundancy, we carry a layer and neuron-level analysis on the speech models.…”
Section: Introduction
mentioning
confidence: 99%