Proceedings of the Third Conference on Machine Translation: Research Papers 2018
DOI: 10.18653/v1/w18-6321

Simple Fusion: Return of the Language Model

Abstract: Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation. We investigate an alternative simple method to use monolingual data for NMT training: We combine the scores of a pre-trained and fixed language model (LM) with the scores of a translation model (TM) while the TM is trained from scratch. To achieve that, we train the translation model to predict the residual probability of the training data added to the prediction of the LM. This enables the TM to focus it…
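The residual formulation in the abstract can be made concrete with a small sketch. Below is a minimal PyTorch-style illustration, assuming the fused distribution is a softmax over the sum of the TM's unnormalized scores and the frozen LM's log-probabilities; the function and tensor names are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn.functional as F

def simple_fusion_logprobs(tm_logits: torch.Tensor,
                           lm_logprobs: torch.Tensor) -> torch.Tensor:
    """Fuse TM scores with a fixed LM before normalization (a sketch).

    tm_logits:   (batch, vocab) unnormalized translation-model scores
    lm_logprobs: (batch, vocab) log P_LM(y_t | y_<t) from the frozen LM
    """
    # detach() keeps the LM fixed: gradients flow only through the TM,
    # so the TM learns the residual that the LM cannot explain.
    return F.log_softmax(tm_logits + lm_logprobs.detach(), dim=-1)

def fused_loss(tm_logits: torch.Tensor,
               lm_logprobs: torch.Tensor,
               targets: torch.Tensor) -> torch.Tensor:
    """Ordinary NLL training loss computed on the fused distribution."""
    return F.nll_loss(simple_fusion_logprobs(tm_logits, lm_logprobs), targets)
```

Because the LM term is constant with respect to the TM's parameters, the TM is free to spend its capacity on source-conditioned information while relying on the LM for target-side fluency.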

Cited by 54 publications (60 citation statements)
References: 25 publications
“…Applying two countermeasures, GROVER and GLTR, to the detection of fake reviews demonstrated a detection accuracy of around 90%. We plan to investigate ways to further preserve both sentiment and context information by using cold fusion [33] or simple fusion [34]. Since the generated reviews are the most probable sequences, they lack diversity, and the corresponding distribution area may already be covered by the countermeasures.…”
Section: Discussion (mentioning)
Confidence: 99%
“…In contrast to parallel data, large quantities of monolingual data can easily be collected for many languages. Previous work proposed several ways to take advantage of monolingual data to improve translation models trained on parallel data, such as integrating a separately trained language model into the NMT system architecture (Gulcehre et al. 2017; Stahlberg et al. 2018) or exploiting monolingual data to create synthetic parallel data. The latter was first proposed for SMT (Schwenk 2008; Lambert et al. 2011) but remains the most prevalent approach in NMT (Barrault et al. 2019) due to its simplicity and effectiveness.…”
Section: Low-resource MT (mentioning)
Confidence: 99%
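The LM-integration line of work cited above is often realized as shallow fusion at decoding time: the TM and LM log-probabilities are interpolated when scoring beam-search candidates. A minimal sketch, assuming per-step log-probabilities are already available; the function name and the weight 0.3 are illustrative assumptions.

```python
import torch

def shallow_fusion_scores(tm_logprobs: torch.Tensor,
                          lm_logprobs: torch.Tensor,
                          lam: float = 0.3) -> torch.Tensor:
    """Interpolate TM and LM log-probabilities for beam-search scoring.

    Both tensors are (beam, vocab); `lam` is tuned on held-out data.
    Unlike simple fusion, nothing is trained here: both models stay
    fixed and are combined only at inference time.
    """
    return tm_logprobs + lam * lm_logprobs
```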
“…Reference [5] draws an empirical roadmap of how the amount of back-translated (BT) data impacts the performance of the final system; the authors further investigate additional factors of BT data in different SMT and NMT approaches, as well as data quantities [6]. Although back-translation has proved effective, works such as [7] have demonstrated that increasing the amount of monolingual data improves translation quality only up to a point, after which it begins to degrade.…”
Section: B. Back Translation (mentioning)
Confidence: 99%
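For readers unfamiliar with the back-translation pipeline these statements discuss, a minimal sketch follows. The `train_reverse` callable stands in for a real NMT toolkit and is a hypothetical dependency introduced here for illustration.

```python
from typing import Callable, List, Tuple

def back_translate(
    parallel: List[Tuple[str, str]],           # real (src, tgt) pairs
    mono_tgt: List[str],                       # target-side monolingual text
    train_reverse: Callable[[List[Tuple[str, str]]], Callable[[str], str]],
) -> List[Tuple[str, str]]:
    """Augment real pairs with synthetic pairs from monolingual data.

    `train_reverse` is a hypothetical stand-in: it trains a target->source
    model on (tgt, src) pairs and returns a translate function.
    """
    # 1. Train a reverse (target -> source) model on the real parallel data.
    back_model = train_reverse([(t, s) for (s, t) in parallel])
    # 2. Translate monolingual target sentences back into the source language.
    synthetic = [(back_model(t), t) for t in mono_tgt]
    # 3. Mix synthetic and real pairs; as the citing work above notes,
    #    adding synthetic data helps only up to a point, so the ratio
    #    of synthetic to real data is worth tuning.
    return parallel + synthetic
```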