Proceedings of the 22nd Conference on Computational Natural Language Learning 2018
DOI: 10.18653/v1/k18-1056

Sequence to Sequence Mixture Model for Diverse Machine Translation

Abstract: Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated translations. This can be attributed to the limitation of SEQ2SEQ models in capturing lexical and syntactic variations in a parallel corpus resulting from different styles, genres, topics, or ambiguity of the translation process. In this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model…
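
The abstract describes the model only at a high level, so the sketch below illustrates the general idea in PyTorch: several specialized decoders (the "committee") share one encoder, and the sentence likelihood is a mixture over decoders with a uniform prior. This is a minimal sketch under assumptions of my own; the GRU layers, sizes, and names are illustrative and are not the paper's actual S2SMIX parameterization.

```python
# Minimal sketch of a mixture of specialized decoders sharing one encoder.
# Architecture choices (GRU, sizes, uniform prior) are illustrative assumptions.
import math
import torch
import torch.nn as nn

class MixtureSeq2Seq(nn.Module):
    def __init__(self, vocab_size=1000, hidden=256, num_experts=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        # One specialized decoder ("expert") per mixture component.
        self.decoders = nn.ModuleList(
            [nn.GRU(hidden, hidden, batch_first=True) for _ in range(num_experts)]
        )
        self.out = nn.Linear(hidden, vocab_size)

    def expert_nll(self, src, tgt_in, tgt_out):
        """Per-example negative log-likelihood of the target under each expert."""
        _, h = self.encoder(self.embed(src))               # encode the source once
        ce = nn.CrossEntropyLoss(reduction="none")
        nlls = []
        for dec in self.decoders:
            states, _ = dec(self.embed(tgt_in), h)         # teacher-forced decoding
            logits = self.out(states)                      # (batch, time, vocab)
            nlls.append(ce(logits.transpose(1, 2), tgt_out).sum(dim=1))
        return torch.stack(nlls, dim=1)                    # (batch, num_experts)

    def mixture_nll(self, src, tgt_in, tgt_out):
        """-log p(y|x) with a uniform prior over experts:
        log p(y|x) = logsumexp_z log p(y|x, z) - log K."""
        nll = self.expert_nll(src, tgt_in, tgt_out)
        return -(torch.logsumexp(-nll, dim=1) - math.log(nll.size(1)))
```

At generation time, each expert can be decoded separately to obtain a set of diverse candidate translations from one trained model.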

Cited by 41 publications (29 citation statements) | References 22 publications
Citing publications span 2019 to 2023.

Citation statements (ordered by relevance):

“…Li et al. (2016) and Vijayakumar et al. (2016) adjust decoding algorithms, adding different kinds of diversity regularization terms to encourage diverse outputs. He et al. (2018) and Shen et al. (2019) use a mixture-of-experts (MoE) method, controlling the generation of translations with differentiated latent variables. Sun et al. (2019) generate diverse translations by sampling heads in the encoder-decoder attention module of the Transformer, since different heads may represent different target-source alignments.…”
Section: Analyzing Module Importance With… (mentioning)
confidence: 99%
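
As a concrete illustration of the first family mentioned above (diversity-regularized decoding), here is a toy Python sketch in which candidates expanded at the same step are penalized for repeating tokens already chosen by earlier candidates. The scoring rule and penalty weight are illustrative assumptions, not the exact formulations of Li et al. (2016) or Vijayakumar et al. (2016).

```python
# Toy diversity-penalized step: later candidates are discouraged from
# re-picking tokens already chosen by earlier candidates at this step.
from collections import Counter

def diverse_step(candidate_logprobs, diversity_strength=1.0):
    """candidate_logprobs: one {token: log-prob} dict per candidate hypothesis
    at the current decoding step; candidates are expanded sequentially."""
    chosen = []
    used = Counter()
    for logprobs in candidate_logprobs:
        # Penalize a token in proportion to how many earlier candidates chose it.
        scores = {tok: lp - diversity_strength * used[tok] for tok, lp in logprobs.items()}
        best = max(scores, key=scores.get)
        chosen.append(best)
        used[best] += 1
    return chosen

# Without the penalty both candidates would pick "the"; with it, the second
# candidate falls back to its next-best continuation.
print(diverse_step([{"the": -0.1, "a": -1.2}, {"the": -0.2, "this": -0.9}]))  # ['the', 'this']
```
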
“…Li et al. (2016) and Vijayakumar et al. (2016) proposed adding regularization terms to the beam search algorithm so that it produces more diverse outputs. He et al. (2018) and Shen et al. (2019) introduced latent variables into the NMT model so that it can generate diverse outputs from different latent variables. Moreover, Sun et al. (2019) proposed exploiting the structural characteristics of the Transformer, using the different weights of each head in the multi-head attention mechanism to obtain diverse results.…”
Section: Introduction (mentioning)
confidence: 99%
“…One of the most successful applications of MoE is ensemble learning (Caruana et al., 2004; Liu et al., 2018; Dutt et al., 2017, inter alia). Recent efforts also explore MoE in sequence learning, and to promote diversity in text generation (He et al., 2018; Shen et al., 2019; Cho et al., 2019, inter alia).…”
Section: Related Work (mentioning)
confidence: 99%
“…Prior work on diverse machine translation (He et al., 2018a; Shen et al., 2019; Cho et al., 2019) suggests applying online hard EM by interleaving the following two steps for each mini-batch.…”
Section: KB Module for Inferential Reasoning (mentioning)
confidence: 99%
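
The quoted statement names online hard EM but its two interleaved steps are cut off by the snippet. The sketch below spells out one plausible reading, reusing the illustrative MixtureSeq2Seq model from the sketch near the abstract: an E-step that assigns each sentence pair to its lowest-loss expert, and an M-step that updates only that expert. The batch iterator and optimizer are placeholders, and this is not claimed to be the exact procedure of He et al. (2018a), Shen et al. (2019), or Cho et al. (2019).

```python
# Hedged sketch of online hard EM over mini-batches, assuming the
# illustrative MixtureSeq2Seq.expert_nll defined earlier.
import torch

def online_hard_em(model, optimizer, batches):
    for src, tgt_in, tgt_out in batches:
        # E-step: responsibilities collapse to an argmin; no gradients needed.
        with torch.no_grad():
            assignment = model.expert_nll(src, tgt_in, tgt_out).argmin(dim=1)
        # M-step: minimize each example's NLL under its assigned expert only.
        # (Recomputed with gradients enabled for clarity.)
        nll = model.expert_nll(src, tgt_in, tgt_out)
        loss = nll.gather(1, assignment.unsqueeze(1)).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```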