2022
DOI: 10.1007/s10994-021-06070-y
A study of BERT for context-aware neural machine translation

Citations: cited by 19 publications (10 citation statements)
References: 28 publications
“…Among them, translators play an important role in translation. This is a research analysis of how to quickly identify a translator model with high accuracy and robustness [11]. The basic framework of machine translation is shown in Figure 2.…”
Section: Methods
confidence: 99%
“…Alternatively, sentence-level translations can be refined via reinforcement learning (Xiong et al., 2019; Mansimov et al., 2021) or monolingual repair to post-edit contextual errors in the target language (Voita et al., 2019a). More recently, the use of pretrained language models has been explored for the task, using them to encode the context (Wu et al., 2022) or to initialize NMT models (Huang et al., 2023). Other studies directly use Large Language Models to perform translations, showing that competitive results can be obtained with this approach, although they might still make critical errors in domains such as literature and sometimes perform worse than conventional NMT models in contrastive tests (Wang et al., 2023; Karpinska and Iyyer, 2023; Hendy et al., 2023).…”
Section: Related Work
confidence: 99%
“…A high-capacity teacher network predicts soft label targets for a small student network to learn from, hence distilling knowledge from the larger network into the smaller one (Hinton et al., 2015) [114]. Knowledge distillation has also been applied in NLP for pre-training, as in DistilBERT, a distilled version of BERT (Sanh et al., 2019) [115], and BERT has been leveraged to encode contextual information by aggregating features (Wu et al.) [116]. In this setup, knowledge loops from teacher to student in language modeling, with the task of predicting the masked token.…”
Section: Knowledge Distillation
confidence: 99%
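
The excerpt above describes the standard soft-label distillation objective of Hinton et al. (2015): the student is trained both on the hard labels and on the teacher's temperature-softened output distribution. The following is a minimal sketch of that loss in PyTorch; the temperature and interpolation weight are illustrative assumptions, not values taken from the cited papers.

```python
# Minimal sketch of soft-label knowledge distillation (Hinton et al., 2015).
# Hyperparameters (temperature, alpha) are illustrative assumptions.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend cross-entropy on hard labels with a KL term that pushes the
    student toward the teacher's temperature-softened distribution."""
    # Soft targets produced by the high-capacity teacher.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student soft distributions,
    # scaled by T^2 as in Hinton et al. (2015).
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Standard supervised loss on the hard labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term


if __name__ == "__main__":
    # Random tensors stand in for teacher/student outputs over a vocabulary.
    batch, vocab = 4, 10
    teacher_logits = torch.randn(batch, vocab)
    student_logits = torch.randn(batch, vocab, requires_grad=True)
    labels = torch.randint(0, vocab, (batch,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(loss.item())
```

In a masked-language-modeling setting such as the DistilBERT example mentioned in the excerpt, the same loss would be applied per masked position, with the vocabulary as the label space.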