Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing

Schuster, Tal; Ram, Ori; Barzilay, Regina; Globerson, Amir

doi:10.48550/arxiv.1902.09492

Cited by 9 publications

(12 citation statements)

References 0 publications

“…Cross-lingual embedding alignment find that independently trained monolingual word embedding spaces in ELMo are isometric under rotation. Similarly, Schuster et al (2019) and Wang et al (2019) geometrically align contextualized word embeddings trained independently. find that cross-lingual transfer in mBERT is possible even without shared vocabulary tokens, which they attribute to this isometricity.…”

Section: Related Work (mentioning)

confidence: 99%

Finding Universal Grammatical Relations in Multilingual BERT

Chi, Hewitt, Manning

2020

Preprint

Recent work has found evidence that Multilingual BERT (mBERT), a transformer-based multilingual masked language model, is capable of zero-shot cross-lingual transfer, suggesting that some aspects of its representations are shared cross-lingually. To better understand this overlap, we extend recent work on finding syntactic trees in neural networks' internal representations to the multilingual setting. We show that subspaces of mBERT representations recover syntactic tree distances in languages other than English, and that these subspaces are approximately shared across languages. Motivated by these results, we present an unsupervised analysis method that provides evidence mBERT learns representations of syntactic dependency labels, in the form of clusters which largely agree with the Universal Dependencies taxonomy. This evidence suggests that even without explicit supervision, multilingual masked language models learn certain linguistic universals.

“…With the enhancement of the TF-IDF algorithm, bag-of-words addition model further achieves a performance improvement, and the term-by-term query translation model yields the state-of-the-art performance for unsupervised cross-lingual retrieval (Litschko et al 2018). Another group of methods transfer the cross-lingual retrieval task to a monolingual retrieval task by using machine translation systems (Schuster et al 2019), e.g., combing cross-language tree kernel align with neural machine translation system for retrieval (Da San Martino et al 2017), learning language invariant representations for cross-lingual question re-ranking .…”

Section: Related Work, Information Retrieval Models (mentioning)

confidence: 99%

“…In cross-lingual mapping learning, a linear mapping between the source embeddings space and the target embeddings space is learned in an adversarial fashion (Conneau et al 2018a). To enhance the quality of learned bilingual word embeddings, various refinement strategies are proposed, such as synthetic parallel vocabulary building (Artetxe, Labaka, and Agirre 2017), orthogonal constraint (Smith et al 2017), cross-domain similarity local scaling (Søgaard, Ruder, and Vulić 2018), self-boosting (Artetxe, Labaka, and Agirre 2018), byte-pair encodings (Sennrich, Haddow, and Birch 2016; Lample et al 2018). Alternatively, a context-dependent cross-lingual representation mapping based on pre-trained ELMo (Peters et al 2018) is proposed recently to boost the performance of cross-lingual learning.…”

Section: Low-resource Cross-lingual Learning (mentioning)

confidence: 99%

“…Alternatively, a context-dependent cross-lingual representation mapping based on pre-trained ELMo (Peters et al 2018) is proposed recently to boost the performance of cross-lingual learning (Schuster et al 2019). Unlike previous work, we propose to train a cross-lingual mapping upon the context-dependent monolingual BERT with an effective refinement strategy.…”

Section: Low-resource Cross-lingual Learning (mentioning)

confidence: 99%

“…Different from the case in context-independent word embeddings, the contextualized embedding of a word varies with the surrounding contexts. In order to draw on the effective off-the-shelf cross-lingual alignment algorithms, we bridge this gap by considering the averaged contextualized representations as the anchor (i.e., the static embedding) for each word in a monolingual corpus following Schuster et al (2019). Then we adopt MUSE (Conneau et al 2018a) to learn the cross-lingual alignment matrix W as the initialization of the mapping matrix of our model.…”

Section: Bidirectional Cross-lingual Alignment (mentioning)

confidence: 99%
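
The anchor idea in the statement above can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: the function names `compute_anchors` and `procrustes` are hypothetical, the inputs are toy arrays rather than ELMo or BERT outputs, and the alignment step uses a closed-form orthogonal Procrustes solution over a given word-pair dictionary as a stand-in for MUSE, whose adversarial training is not reproduced here.

```python
from collections import defaultdict

import numpy as np


def compute_anchors(tokens, vectors):
    """Average the contextual vectors of each word type into one static anchor.

    tokens  : sequence of word strings, one per token occurrence
    vectors : array of shape (n_tokens, dim), one contextual vector per token
    """
    grouped = defaultdict(list)
    for tok, vec in zip(tokens, vectors):
        grouped[tok].append(vec)
    return {tok: np.mean(vecs, axis=0) for tok, vecs in grouped.items()}


def procrustes(src, tgt):
    """Return the orthogonal W minimizing ||src @ W.T - tgt||_F.

    src, tgt : arrays of shape (n_pairs, dim); row i of src and row i of tgt
               are the anchors of a translation pair (assumed to be given).
    """
    u, _, vt = np.linalg.svd(tgt.T @ src)
    return u @ vt


if __name__ == "__main__":
    # Toy corpus: "bank" occurs twice with different contextual vectors.
    tokens = ["bank", "river", "bank"]
    vectors = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])
    anchors = compute_anchors(tokens, vectors)
    print(anchors["bank"])

    # Toy alignment: the target space is a rotated copy of the source space.
    rng = np.random.default_rng(0)
    rotation, _ = np.linalg.qr(rng.normal(size=(4, 4)))
    src = rng.normal(size=(20, 4))
    tgt = src @ rotation.T
    w = procrustes(src, tgt)
    print(np.allclose(src @ w.T, tgt))
```

Once W is learned on anchors, the same matrix can be applied to every contextual vector of the source language, which is what makes an anchor-level alignment usable for token-level representations.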

Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce

Liu, Wang, et al. 2020

Preprint

With the prosperity of cross-border e-commerce, there is an urgent demand for designing intelligent approaches for assisting e-commerce sellers to offer local products for consumers from all over the world. In this paper, we explore a new task of cross-lingual information retrieval, i.e., cross-lingual set-to-description retrieval in cross-border e-commerce, which involves matching product attribute sets in the source language with persuasive product descriptions in the target language. We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language. As the dataset construction process is both time-consuming and costly, the new dataset comprises only 13.5k pairs, which is a low-resource setting and can be viewed as a challenging testbed for model development and evaluation in cross-border e-commerce. To tackle this cross-lingual set-to-description retrieval task, we propose a novel cross-lingual matching network (CLMN) with the enhancement of context-dependent cross-lingual mapping upon the pre-trained monolingual BERT representations. Experimental results indicate that our proposed CLMN yields impressive results on the challenging task and the context-dependent cross-lingual mapping on BERT yields noticeable improvement over the pre-trained multi-lingual BERT model.

Unsupervised Cross-lingual Representation Learning at Scale

Conneau, Khandelwal, Goyal, et al. 2020

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +14.6% average accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 15.7% in XNLI accuracy for Swahili and 11.4% for Urdu over previous XLM models. We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make our code, data and models publicly available.
