XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization

Raganato, Alessandro; Pasini, Tommaso; Camacho-Collados, José; Pilehvar, Mohammad Taher

doi:10.18653/v1/2020.emnlp-main.584

Cited by 29 publications

(43 citation statements)

References 34 publications


“…WiC (Pilehvar and Camacho-Collados 2019) is the only SuperGLUE task where systems need to model the semantics of words in context (extended to several more languages in XL-WiC [Raganato et al 2020]). In the Appendix we provide results for this task.…”

mentioning

confidence: 99%

Analysis and Evaluation of Language Models for Word Sense Disambiguation

Loureiro

Rezaee

Pilehvar

et al. 2021

Computational Linguistics

Self Cite


Transformer-based language models have taken many fields in NLP by storm. BERT and its derivatives dominate most of the existing evaluation benchmarks, including those for Word Sense Disambiguation (WSD), thanks to their ability in capturing context-sensitive semantic nuances. However, there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses. In this article, we provide an in-depth quantitative and qualitative analysis of the celebrated BERT model with respect to lexical ambiguity. One of the main conclusions of our analysis is that BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. Our analysis also reveals that in some cases language models come close to solving coarse-grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources. However, this scenario rarely occurs in real-world settings and, hence, many practical challenges remain even in the coarse-grained setting. We also perform an in-depth comparison of the two main language model based WSD strategies, i.e., fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and it can better exploit limited available training data. In fact, the simple feature extraction strategy of averaging contextualized embeddings proves robust even using only three training sentences per word sense, with minimal improvements obtained by increasing the size of this training data.
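The feature extraction strategy mentioned at the end of this abstract (averaging contextualized embeddings per word sense) can be illustrated with a minimal sketch. It assumes the contextualized vectors have already been extracted from a language model; the toy vectors, the sense labels, the function names, and the use of cosine similarity for the nearest-sense lookup are all illustrative assumptions, not details taken from the paper.

```python
from math import sqrt


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_sense_centroids(examples_by_sense):
    """Map each sense label to the average of its training embeddings."""
    return {sense: mean_vector(vecs) for sense, vecs in examples_by_sense.items()}


def predict_sense(centroids, vector):
    """Pick the sense whose centroid is most similar to the query vector."""
    return max(centroids, key=lambda sense: cosine(centroids[sense], vector))


# Hypothetical 3-dimensional "contextualized embeddings" of the word "bank",
# three training sentences per sense (the setting the abstract highlights).
toy_examples = {
    "bank%finance": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [0.8, 0.0, 0.1]],
    "bank%river": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.1], [0.2, 0.8, 0.0]],
}

centroids = build_sense_centroids(toy_examples)
prediction = predict_sense(centroids, [0.7, 0.3, 0.0])
print(prediction)
```

In a real setup the vectors would come from a model such as BERT, and the choice of layer and pooling would need to follow the cited paper.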


“…We use the crosslingual Word-in-Context dataset (XL-WiC; Raganato et al, 2020) with data in 12 diverse languages. The task is to predict whether an ambiguous word that appears in two different sentences share the same meaning.…”

Section: Word-in-context

mentioning

confidence: 99%

“…We follow Raganato et al (2020) and add a binary classification head on top of the pretrained MMLM model, which takes as input the concatenation of the target words' embedding in the two contexts. We use the output of the 24-th layer as the target words' representation.…”

Section: B12 XL-WiC

mentioning

confidence: 99%
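The classification setup described in the statement above (a binary head over the concatenation of the target word's embeddings in the two contexts) can be sketched roughly as follows. This is a toy, dependency-free illustration, not the authors' implementation: the class and function names are hypothetical, averaging subword vectors to obtain the target representation is an assumption, and the head's weights are left at zero where a real system would learn them during fine-tuning.

```python
from math import exp


def target_embedding(token_vectors, start, end):
    """Average the vectors of the target word's tokens in [start, end).

    Averaging over subword tokens is an assumed pooling choice.
    """
    span = token_vectors[start:end]
    return [sum(column) / len(span) for column in zip(*span)]


def pair_features(embedding_a, embedding_b):
    """Concatenate the target representations from the two contexts."""
    return list(embedding_a) + list(embedding_b)


class BinaryHead:
    """Single linear layer followed by a sigmoid (weights untrained here)."""

    def __init__(self, input_dim):
        self.weights = [0.0] * input_dim
        self.bias = 0.0

    def probability(self, features):
        """Probability that the target word has the same meaning in both contexts."""
        logit = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + exp(-logit))


# Hypothetical per-token vectors, standing in for one encoder layer's output.
context_a = [[0.2, 0.4], [0.6, 0.8], [1.0, 0.0]]  # target spans tokens 1..2
context_b = [[0.1, 0.1], [0.5, 0.3]]              # target is token 1

emb_a = target_embedding(context_a, 1, 3)
emb_b = target_embedding(context_b, 1, 2)
features = pair_features(emb_a, emb_b)

head = BinaryHead(input_dim=len(features))
p_same = head.probability(features)
print(len(features), p_same)
```

With zero-initialized weights the head is uninformative; the sketch only shows how the input to the head is assembled from the two contexts.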

“…We use the state-of-the-art MMLM XLM-Rlarge (Conneau et al, 2020) and show that by adding an add-on training step using Wikipedia hyperlink prediction we consistently improve several zero-shot crosslingual natural language understanding tasks across a diverse array of languages: crosslingual Word Sense Disambiguation in 18 languages including English (XL-WSD; Pasini et al, 2021); the crosslingual Word-in-Context task (XL-WiC; Raganato et al, 2020) in 12 non-English languages; and in 7 tasks from the XTREME benchmark (Hu et al, 2020) in up to 40 languages.…”

Section: Introduction

mentioning

confidence: 99%


Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks

Calixto

Raganato

Pasini

2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

Self Cite


Masked language models have quickly become the de facto standard when processing text. Recently, several approaches have been proposed to further enrich word representations with external knowledge sources such as knowledge graphs. However, these models are devised and evaluated in a monolingual setting only. In this work, we propose a language-independent entity prediction task as an intermediate training procedure to ground word representations on entity semantics and bridge the gap across different languages by means of a shared vocabulary of entities. We show that our approach effectively injects new lexical-semantic knowledge into neural models, improving their performance on different semantic tasks in the zero-shot crosslingual setting. As an additional advantage, our intermediate training does not require any supplementary input, allowing our models to be applied to new datasets right away. In our experiments, we use Wikipedia articles in up to 100 languages and already observe consistent gains compared to strong baselines when predicting entities using only the English Wikipedia. Further adding extra languages leads to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on ever increasing amounts of Wikipedia languages. * Work carried out while at the University of Rome "La Sapienza".


“…Recently, as an application of Word Sense Disambiguation (WSD) (Navigli, 2009, 2012), Word-in-Context (WiC) disambiguation has been framed as a binary classification task to identify if the occurrences of a target word in two contexts correspond to the same meaning or not. The release of the WiC dataset (Pilehvar and Camacho-Collados, 2019), followed by the Multilingual Word-in-Context (XL-WiC) dataset (Raganato et al, 2020), has helped provide a common ground for evaluating and comparing systems while encouraging research in WSD and context-sensitive word embeddings.…”

Section: Introduction

mentioning

confidence: 99%

Cambridge at SemEval-2021 Task 2: Neural WiC-Model with Data Augmentation and Exploration of Representation

Yuan

Strohmaier

2021

Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)


This paper describes the system of the Cambridge team submitted to the SemEval-2021 shared task on Multilingual and Cross-lingual Word-in-Context Disambiguation. Building on top of a pre-trained masked language model, our system is first pre-trained on out-of-domain data, and then fine-tuned on in-domain data. We demonstrate the effectiveness of the proposed two-step training strategy and the benefits of data augmentation from both existing examples and new resources. We further investigate different representations and show that the addition of distance-based features is helpful in the word-in-context disambiguation task. Our system yields highly competitive results in the cross-lingual track without training on any cross-lingual data; and achieves state-of-the-art results in the multilingual track, ranking first in two languages (Arabic and Russian) and second in French out of 171 submitted systems.
