Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018
DOI: 10.18653/v1/p18-2053

Bag-of-Words as Target for Neural Machine Translation

Abstract: A sentence can be translated into more than one correct sentence. However, most existing neural machine translation models use only one of the correct translations as the target, and the other correct sentences are penalized as incorrect during training. Since most correct translations of a sentence share a similar bag-of-words, it is possible to distinguish the correct translations from the incorrect ones by their bag-of-words. In this paper, we propose an approach that u…
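The idea in the abstract lends itself to a compact sketch: in addition to the usual per-position cross-entropy, the decoder's per-step word distributions can be summed over time into a predicted bag-of-words and scored against the reference words. The PyTorch formulation below is a minimal illustration of that idea, not the paper's exact implementation; the function name `bow_auxiliary_loss` and the renormalization step are assumptions.

```python
import torch
import torch.nn.functional as F

def bow_auxiliary_loss(step_logits, target_ids, pad_id=0):
    """Minimal sketch (not the paper's exact loss): score the decoder's
    position-independent word mass against the reference bag-of-words.

    step_logits: (batch, tgt_len, vocab) decoder logits at each step.
    target_ids:  (batch, tgt_len) reference token ids.
    """
    probs = F.softmax(step_logits, dim=-1)        # per-step word distributions
    bow_probs = probs.sum(dim=1)                  # (batch, vocab): order-free mass
    bow_probs = bow_probs / bow_probs.sum(dim=-1, keepdim=True)

    # Negative log-likelihood of each reference word under the predicted
    # bag, ignoring padding positions.
    nll = -torch.log(bow_probs.gather(1, target_ids) + 1e-9)
    mask = (target_ids != pad_id).float()
    return (nll * mask).sum() / mask.sum()
```

In training, such a term would be added to the standard sentence-level translation loss with a weighting coefficient, so word order is still learned from the per-position objective while the bag-of-words term stops correct-but-different translations from being punished wholesale.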

Cited by 69 publications (46 citation statements). References: 19 publications.
“…Under the reasoning-required setting, long answers are available at training time but not at inference time. We use them as an additional training signal: similar to Ma et al. (2018), who regularize neural machine translation models with binary bag-of-words (BoW) statistics, we fine-tune BioBERT with an auxiliary task of predicting the binary BoW statistics of the long answers, also using the special [CLS] embedding. We minimize the binary cross-entropy loss of this auxiliary task:…”
Section: Long Answer As Additional Supervision
confidence: 99%
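The auxiliary objective in this excerpt is simple enough to sketch: a linear head over the [CLS] embedding, trained with binary cross-entropy against a multi-hot vector marking which vocabulary items occur in the long answer. The module below is an assumed reconstruction, not the cited paper's code; the name `BowAuxHead` and the single-layer projection are hypothetical choices.

```python
import torch
import torch.nn as nn

class BowAuxHead(nn.Module):
    """Hypothetical auxiliary head: predict binary bag-of-words statistics
    of the long answer from the encoder's [CLS] embedding."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)
        self.loss_fn = nn.BCEWithLogitsLoss()

    def forward(self, cls_embedding, bow_targets):
        # cls_embedding: (batch, hidden) -- e.g. BioBERT's [CLS] output
        # bow_targets:   (batch, vocab)  -- 1.0 if the word occurs in the
        #                                   long answer, else 0.0
        logits = self.proj(cls_embedding)
        return self.loss_fn(logits, bow_targets)
```

This loss would be minimized jointly with the main question-answering objective, so the long answers supervise training without being needed at inference.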
“…Text generation is an important task in Natural Language Processing (NLP), as it lays the foundation for many applications, such as dialogue generation, machine translation (Ma et al., 2018b), text summarization (Ma et al., 2018a), and table summarization (Liu et al., 2017). In these tasks, most systems are built upon the sequence-to-sequence paradigm (Sutskever et al., 2014), an end-to-end model that encodes a source sentence into a dense vector and then decodes that vector into a target sentence.…”
Section: Introduction
confidence: 99%
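For readers unfamiliar with the paradigm this excerpt describes, here is a minimal encoder-decoder in the style of Sutskever et al. (2014), written in PyTorch; all sizes and names are illustrative placeholders, not any cited system's configuration.

```python
import torch
import torch.nn as nn

class MinimalSeq2Seq(nn.Module):
    """Illustrative encoder-decoder: the encoder compresses the source into
    a fixed dense summary (its final hidden state), which initializes the
    decoder that generates the target sentence."""

    def __init__(self, src_vocab, tgt_vocab, emb=256, hidden=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the source; state = (h, c) is the dense summary.
        _, state = self.encoder(self.src_emb(src_ids))
        # Decode the target conditioned on that summary (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits
```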
“…Bags of words have been used, for instance, in fraud detection [27]. More recently, bags of words have been used successfully in neural machine translation as the target of the translation, since a sentence can be translated in many different ways [28]. In [29], multi-modal bags of words have been used for cross-domain sentiment analysis.…”
Section: Multisets
confidence: 99%