Makoto Morishita scite author profile

Makoto Morishita

5Publications

90Citation Statements Received

102Citation Statements Given

How they've been cited

How they cite others

100

102

Affiliations

NTT (Japan), NTT (United States), Tohoku University

Publications

Order By: Most citations

NTT Neural Machine Translation Systems at WAT 2019

Morishita

Suzuki

Nagata

2019

View full text Add to dashboard Cite

In this paper, we describe our systems that were submitted to the translation shared tasks at WAT 2019. This year, we participated in two distinct types of subtasks, a scientific paper subtask and a timely disclosure subtask, where we only considered English-to-Japanese and Japanese-to-English translation directions. We submitted two systems (En-Ja and Ja-En) for the scientific paper subtask and two systems (Ja-En, texts, items) for the timely disclosure subtask. Three of our four systems obtained the best human evaluation performances. We also confirmed that our new additional web-crawled parallel corpus improves the performance in unconstrained settings.

show abstract

An Empirical Study of Mini-Batch Creation Strategies for Neural Machine Translation

Morishita¹,

Oda²,

Neubig³

et al. 2017

View full text Add to dashboard Cite

Training of neural machine translation (NMT) models usually uses mini-batches for efficiency purposes. During the minibatched training process, it is necessary to pad shorter sentences in a mini-batch to be equal in length to the longest sentence therein for efficient computation. Previous work has noted that sorting the corpus based on the sentence length before making mini-batches reduces the amount of padding and increases the processing speed. However, despite the fact that mini-batch creation is an essential step in NMT training, widely used NMT toolkits implement disparate strategies for doing so, which have not been empirically validated or compared. This work investigates mini-batch creation strategies with experiments over two different datasets. Our results suggest that the choice of a minibatch creation strategy has a large effect on NMT training and some length-based sorting strategies do not always work well compared with simple shuffling.

show abstract

NTT’s Neural Machine Translation Systems for WMT 2018

Morishita¹,

Suzuki²,

Nagata³

2018

View full text Add to dashboard Cite

This paper describes NTT's neural machine translation systems submitted to the WMT 2018 English-German and German-English news translation tasks. Our submission has three main components: the Transformer model, corpus cleaning, and right-to-left nbest re-ranking techniques. Through our experiments, we identified two keys for improving accuracy: filtering noisy training sentences and right-to-left re-ranking. We also found that the Transformer model requires more training data than the RNN-based model, and the RNN-based model sometimes achieves better accuracy than the Transformer model when the corpus is small.

show abstract

An Empirical Study of Building a Strong Baseline for Constituency Parsing

Suzuki

Takase

Kamigaito

et al. 2018

View full text Add to dashboard Cite

This paper investigates the construction of a strong baseline based on general purpose sequence-to-sequence models for constituency parsing. We incorporate several techniques that were mainly developed in natural language generation tasks, e.g., machine translation and summarization, and demonstrate that the sequenceto-sequence model achieves the current top-notch parsers' performance without requiring explicit task-specific knowledge or architecture of constituent parsing.

show abstract

NTT’s Machine Translation Systems for WMT19 Robustness Task

Murakami¹,

Morishita²,

Hirao³

et al. 2019

View full text Add to dashboard Cite

This paper describes NTT's submission to the WMT19 robustness task. This task mainly focuses on translating noisy text (e.g., posts on Twitter), which presents different difficulties from typical translation tasks such as news. Our submission combined techniques including utilization of a synthetic corpus, domain adaptation, and a placeholder mechanism, which significantly improved over the previous baseline. Experimental results revealed the placeholder mechanism, which temporarily replaces the non-standard tokens including emojis and emoticons with special placeholder tokens during translation, improves translation accuracy even with noisy texts.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.