A large-scale parallel corpus is required to train encoder-decoder neural machine translation models. The method of using synthetic parallel texts, in which target-side monolingual corpora are automatically translated back into source-language sentences (back-translation), is effective for improving the decoder but unreliable for enhancing the encoder. In this paper, we propose a method that enhances the encoder and the attention mechanism using target monolingual corpora by generating multiple source sentences via sampling. Using multiple source sentences achieves diversity close to that of human translations. Our experimental results show that translation quality improves as the number of synthetic source sentences per target sentence increases, approaching the quality achieved with a manually created parallel corpus.
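The sampling step can be illustrated concretely. Below is a minimal sketch of sampled back-translation using the Hugging Face transformers API; the Marian checkpoint, the number of samples k, and the generation length are illustrative assumptions, not the paper's actual settings. Sampling from the full output distribution, rather than beam search, is what yields diverse synthetic sources.

```python
# Minimal sketch of sampled back-translation with Hugging Face transformers.
# Assumptions (not from the paper): a Marian target->source checkpoint and
# k=4 samples per sentence; the actual models and k may differ.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

CHECKPOINT = "Helsinki-NLP/opus-mt-de-en"  # illustrative target->source model
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT).eval()

def sample_sources(target_sentence: str, k: int = 4) -> list[str]:
    """Generate k diverse synthetic source sentences by ancestral sampling
    from the full softmax (top_k=0 disables top-k truncation)."""
    inputs = tokenizer(target_sentence, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **inputs,
            do_sample=True,          # sample instead of beam search
            top_k=0,                 # sample from the unrestricted distribution
            num_return_sequences=k,
            max_new_tokens=64,
        )
    return tokenizer.batch_decode(out, skip_special_tokens=True)

# Each (sampled_source, target) pair is added to the synthetic parallel corpus,
# so one monolingual target sentence contributes k training pairs.
for src in sample_sources("Ein Beispielsatz für die Rückübersetzung."):
    print(src)
```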
This paper presents a predicate-argument structure analyzer that simultaneously performs zero-anaphora resolution. By adding as candidate arguments noun phrases that appear not only in the sentence of the target predicate but also outside it, our analyzer identifies arguments regardless of whether they occur in that sentence. Because we adopt discriminative models based on maximum entropy for argument identification, new features can be added easily. We add language-model scores as well as contextual features, and we also use contextual information to restrict the candidate arguments.
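As a rough illustration of the discriminative setup, the sketch below scores candidate arguments with a maximum-entropy (logistic-regression) classifier via scikit-learn. The feature names, including the language-model score and the simple contextual salience count, are hypothetical stand-ins for the paper's actual feature set.

```python
# Illustrative sketch of discriminative argument identification with a
# maximum-entropy (logistic-regression) classifier. All features below
# are hypothetical examples, not the paper's exact feature templates.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def candidate_features(pred: dict, cand: dict) -> dict:
    return {
        "pred_lemma": pred["lemma"],
        "cand_head": cand["head"],
        # Intra- vs. inter-sentential candidates are handled uniformly.
        "same_sentence": cand["sent_id"] == pred["sent_id"],
        "lm_score": cand["lm_score"],        # language-model score feature
        "salience": cand["mention_count"],   # simple contextual feature
    }

def train_identifier(train):
    """train: iterable of (predicate, candidate, is_argument) triples,
    where candidates have already been restricted by contextual cues."""
    vec = DictVectorizer()
    X = vec.fit_transform(candidate_features(p, c) for p, c, _ in train)
    y = [label for _, _, label in train]
    clf = LogisticRegression(max_iter=1000)  # maximum-entropy model
    clf.fit(X, y)
    return vec, clf
```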
In this paper, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model is applied to Transformer-based neural machine translation (NMT). In contrast to monolingual tasks, the number of unlearned model parameters in an NMT decoder is as large as the number of learned parameters in the BERT model. To train all the sub-models appropriately, we employ two-stage optimization, which first trains only the unlearned parameters while freezing the BERT model, and then fine-tunes all the sub-models. In our experiments, two-stage optimization trained stably, whereas direct fine-tuning yielded extremely low BLEU scores. Consequently, the BLEU scores of the proposed method were better than those of the Transformer base model and of the same model without pre-training. Additionally, we confirmed that NMT with the BERT encoder is more effective in low-resource settings.
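The two-stage schedule can be sketched in PyTorch as follows; the model attribute bert_encoder, the epoch counts, and the learning rates are illustrative assumptions rather than the paper's reported settings. Stage 1 freezes the BERT encoder and updates only the randomly initialized decoder parameters; stage 2 unfreezes everything and fine-tunes the full model at a smaller learning rate.

```python
# Sketch of two-stage optimization for an NMT model with a pretrained BERT
# encoder and a randomly initialized Transformer decoder. The attribute
# name `bert_encoder`, the epoch counts, and the learning rates are
# illustrative assumptions.
import torch

def run_epochs(model, loader, loss_fn, opt, n_epochs):
    model.train()
    for _ in range(n_epochs):
        for src, tgt_in, tgt_out in loader:
            opt.zero_grad()
            logits = model(src, tgt_in)                      # (batch, seq, vocab)
            loss = loss_fn(logits.transpose(1, 2), tgt_out)  # cross-entropy over vocab
            loss.backward()
            opt.step()

def two_stage_train(model, loader, loss_fn, epochs_stage1=5, epochs_stage2=5):
    # Stage 1: freeze BERT; train only the unlearned (decoder) parameters.
    for p in model.bert_encoder.parameters():
        p.requires_grad = False
    opt = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4
    )
    run_epochs(model, loader, loss_fn, opt, epochs_stage1)

    # Stage 2: unfreeze everything; fine-tune all sub-models at a lower rate.
    for p in model.bert_encoder.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(model.parameters(), lr=2e-5)
    run_epochs(model, loader, loss_fn, opt, epochs_stage2)
```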
The decomposition of N2O(a) was studied on Rh(110) at 95-200 K through analysis of the angular distributions of desorbing N2 by means of angle-resolved thermal desorption. N2O(a) decomposed extensively during heating, emitting N2(g) and releasing O(a). N2 desorption showed four peaks, at 105-110 K (β4-N2), 120-130 K (β3-N2), 140-150 K (β2-N2), and 160-165 K (β1-N2). The appearance of each peak was sensitive to annealing after oxygen adsorption and also to the amount of N2O exposure. The β1-N2 peak was dominant at low N2O exposures and showed a cosine distribution. On the other hand, β2-N2 and β3-N2 on an oxygen-modified surface showed inclined, sharp collimation at around 30° off the surface normal in the plane along the [001] direction, whereas β4-N2 on a clean surface collimated at around 70° off the surface normal, close to the [001] direction. An inclined or surface-parallel form of adsorbed N2O was proposed as the precursor for the inclined N2 desorption.