Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English-French or English-Chinese. For English-Vietnamese, however, the shortage of parallel corpora and the cost of hyper-parameter search present practical challenges to neural approaches. This paper highlights our efforts to improve English-Vietnamese translation in two directions: (1) building the largest open Vietnamese-English corpus to date, and (2) running extensive experiments with the latest neural models to achieve the highest BLEU scores. Our experiments provide practical examples of effectively employing different neural machine translation models with low-resource language pairs.

Introduction

Machine translation is shifting to an end-to-end approach based on deep neural networks. Recent studies in neural machine translation (NMT) such as [41,2,42,14] have produced impressive advances over phrase-based systems while eliminating the need for hand-engineered features. Most NMT systems are based on the encoder-decoder architecture, which consists of two neural networks: the encoder compresses the source sequence into a real-valued vector, which the decoder consumes to generate the target sequence. The process is carried out in an end-to-end fashion, demonstrating the capability of learning representations directly from the training data.

The typical sequence-to-sequence machine translation model consists of two recurrent neural networks (RNNs) and an attention mechanism [2,26]. Despite great improvements over traditional models [42,35,27], this architecture has certain shortcomings: recurrent networks are not easily parallelized, and gradient flow is limited when training deep models. Recent designs such as ConvS2S [14] and the Transformer [41] can be better parallelized while producing better results on WMT datasets. However, NMT models take a long time to train and involve many hyper-parameters. A number of works tackle the problem of hyper-parameter selection [5,33], but they mostly focus on high-resource language pairs, so their findings may not carry over to low-resource translation tasks such as English-Vietnamese. Unlike in computer vision [17,20], transferring hyper-parameter settings from one NMT model to another is nearly impossible [5], which limits the ability of researchers and engineers to obtain well-chosen hyper-parameters and well-trained models.
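To make the encoder-decoder architecture with attention described above concrete, the sketch below implements a minimal RNN encoder and attentional decoder in PyTorch. It is only an illustrative example, not the paper's actual model: the GRU layers, dot-product attention, class names, and layer sizes are assumptions chosen for brevity.

```python
# Minimal sketch of an RNN encoder-decoder with attention for translation.
# Illustrative only; all sizes and names are assumptions, not the paper's model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # A bidirectional GRU compresses the source sentence into a sequence
        # of real-valued states that the decoder attends over.
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, src_ids):
        states, _ = self.rnn(self.embed(src_ids))  # (batch, src_len, 2*hidden)
        return states


class AttentionDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRUCell(emb_dim + 2 * hidden_dim, hidden_dim)
        self.attn_proj = nn.Linear(hidden_dim, 2 * hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt_ids, enc_states):
        batch, tgt_len = tgt_ids.shape
        hidden = enc_states.new_zeros(batch, self.rnn.hidden_size)
        logits = []
        for t in range(tgt_len):
            # Dot-product attention: score each encoder state against the
            # current decoder state and take a weighted average (context).
            query = self.attn_proj(hidden).unsqueeze(1)            # (batch, 1, 2*hidden)
            scores = (query * enc_states).sum(-1)                  # (batch, src_len)
            context = (F.softmax(scores, dim=-1).unsqueeze(-1) * enc_states).sum(1)
            step_in = torch.cat([self.embed(tgt_ids[:, t]), context], dim=-1)
            hidden = self.rnn(step_in, hidden)
            logits.append(self.out(hidden))
        return torch.stack(logits, dim=1)  # (batch, tgt_len, vocab)


# Usage sketch with toy vocabularies and random token ids (teacher forcing).
if __name__ == "__main__":
    enc, dec = Encoder(vocab_size=8000), AttentionDecoder(vocab_size=8000)
    src = torch.randint(0, 8000, (2, 12))   # batch of 2 source sentences
    tgt = torch.randint(0, 8000, (2, 10))   # target prefix fed to the decoder
    print(dec(tgt, enc(src)).shape)         # torch.Size([2, 10, 8000])
```

The loop over target positions is what makes this design hard to parallelize: each decoder step depends on the previous hidden state, which is exactly the limitation that ConvS2S and the Transformer address.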