Roee Aharoni scite author profile

Massively Multilingual Neural Machine Translation

Multilingual neural machine translation (NMT) enables training a single model that supports translation from multiple source languages into multiple target languages. In this paper, we push the limits of multilingual NMT in terms of the number of languages being used. We perform extensive experiments in training massively multilingual NMT models, translating up to 102 languages to and from English within a single model. We explore different setups for training such models and analyze the trade-offs between translation quality and various modeling decisions. We report results on the publicly available TED talks multilingual corpus where we show that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages. Our experiments on a large-scale dataset with 102 languages to and from English and up to one million examples per direction also show promising results, surpassing strong bilingual baselines and encouraging future work on massively multilingual NMT.

Morphological Inflection Generation with Hard Monotonic Attention

Aharoni, Goldberg

2017

104

139

We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection. We evaluate the model on three previously studied morphological inflection generation datasets and show that it provides state-of-the-art results in various setups compared to previous neural and non-neural approaches. Finally, we present an analysis of the continuous representations learned by both the hard and soft attention models for the task, shedding some light on the features such models extract.

Towards String-To-Tree Neural Machine Translation

Aharoni, Goldberg

2017

124

113

We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. Experiments on the WMT16 German-English news translation task show improved BLEU scores when compared to a syntax-agnostic NMT baseline trained on the same dataset. An analysis of the translations from the syntax-aware system shows that it performs more reordering during translation in comparison to the baseline. A small-scale human evaluation also showed an advantage to the syntax-aware system.
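
To make the idea of a linearized, lexicalized constituency tree more concrete, the following is a minimal sketch of how a tree could be flattened into a token sequence that a sequence-to-sequence decoder might emit. The nested-tuple tree representation, the `(LABEL` / `)LABEL` bracket tokens, and the function names `linearize` and `strip_tree` are assumptions made for illustration and are not necessarily the format used in the paper.

```python
from typing import List, Tuple, Union

# Hypothetical tree representation: a leaf is a word (str),
# an internal node is a tuple (label, child_1, child_2, ...).
Tree = Union[str, Tuple]


def linearize(tree: Tree) -> List[str]:
    """Flatten a constituency tree into bracket tokens interleaved with words."""
    if isinstance(tree, str):
        return [tree]  # lexicalized: the words stay in the output sequence
    label, *children = tree
    tokens = [f"({label}"]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(f"){label}")
    return tokens


def strip_tree(tokens: List[str]) -> List[str]:
    """Recover the plain translation by dropping bracket tokens.

    Assumes no real word starts with '(' or ')'.
    """
    return [t for t in tokens if not t.startswith(("(", ")"))]


example = ("S", ("NP", "Jane"), ("VP", "had", ("NP", "a", "cat")), ".")
linearized = linearize(example)
print(" ".join(linearized))
print(" ".join(strip_tree(linearized)))
```

Under this assumed format, the target side of each training pair would be the bracketed sequence, and the plain translation is recovered by removing the bracket tokens before scoring.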

Unsupervised Domain Clusters in Pretrained Language Models

Aharoni, Goldberg

2020

126

105

The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality. In addition, domain labels are many times unavailable, making it challenging to build domain-specific systems. We show that massive pretrained language models implicitly learn sentence representations that cluster by domains without supervision, suggesting a simple data-driven definition of domains in textual data. We harness this property and propose domain data selection methods based on such models, which require only a small set of in-domain monolingual data. We evaluate our data selection methods for neural machine translation across five diverse domains, where they outperform an established approach as measured by both BLEU and by precision and recall of sentence selection with respect to an oracle.
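
As a rough illustration of embedding-based data selection, the sketch below ranks candidate sentences by cosine similarity to the centroid of a small in-domain set. It is only one plausible reading of the abstract: the toy two-dimensional vectors stand in for sentence representations from a pretrained language model, and the names `centroid`, `cosine`, and `select_in_domain` are hypothetical.

```python
import math
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_in_domain(
    in_domain: Sequence[Sequence[float]],
    candidates: Sequence[Sequence[float]],
    k: int,
) -> List[int]:
    """Return indices of the k candidates closest to the in-domain centroid."""
    center = centroid(in_domain)
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: cosine(candidates[i], center),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors (assumed); real inputs would be model-derived sentence embeddings.
in_domain_vecs = [[1.0, 0.1], [0.9, 0.0]]
candidate_vecs = [[0.0, 1.0], [1.0, 0.05], [0.7, 0.7]]
selected = select_in_domain(in_domain_vecs, candidate_vecs, k=2)
print(selected)
```

The selected indices would then pick out the sentence pairs used to train or fine-tune a domain-specific translation system.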

Split and Rephrase: Better Evaluation and Stronger Baselines

Aharoni, Goldberg

2018

Splitting and rephrasing a complex sentence into several shorter sentences that convey the same meaning is a challenging problem in NLP. We show that while vanilla seq2seq models can reach high scores on the proposed benchmark (Narayan et al., 2017), they suffer from memorization of the training set, which contains more than 89% of the unique simple sentences from the validation and test sets. To address this, we present a new train-development-test data split and neural models augmented with a copy mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.
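
The kind of train/evaluation leakage described here can be checked with a simple set comparison. The sketch below computes the fraction of unique evaluation sentences that also occur in the training set. The normalization (lowercasing and whitespace collapsing) and the function names are assumptions for illustration, and the toy sentences are invented, not taken from the benchmark.

```python
from typing import Iterable


def normalize(sentence: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(sentence.lower().split())


def overlap_ratio(train: Iterable[str], evaluation: Iterable[str]) -> float:
    """Fraction of unique evaluation sentences that also appear in training."""
    train_set = {normalize(s) for s in train}
    eval_set = {normalize(s) for s in evaluation}
    if not eval_set:
        return 0.0
    return len(eval_set & train_set) / len(eval_set)


# Invented toy data for demonstration only.
train_simple = ["A is B .", "C is D ."]
eval_simple = ["a is b .", "E is F .", "C  is D ."]
ratio = overlap_ratio(train_simple, eval_simple)
print(f"{ratio:.2%} of unique evaluation sentences appear in training")
```

A high ratio on a real split would suggest that strong scores may partly reflect memorization, which is the motivation for constructing a new data split.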
