2021
DOI: 10.1162/tacl_a_00400

Let’s Play Mono-Poly: BERT Can Reveal Words’ Polysemy Level and Partitionability into Senses

Abstract: Pre-trained language models (LMs) encode rich information about linguistic structure but their knowledge about lexical polysemy remains unclear. We propose a novel experimental setup for analyzing this knowledge in LMs specifically trained for different languages (English, French, Spanish, and Greek) and in multilingual BERT. We perform our analysis on datasets carefully designed to reflect different sense distributions, and control for parameters that are highly correlated with polysemy such as frequency and …

Cited by 24 publications (7 citation statements)
References 45 publications
“…PLMs, unlike embedding methods such as GloVe and word2vec, have been shown to learn a good estimation of the different senses of words during pre-training (Vulić et al. 2020; Garí Soler & Apidianaki 2021; Haber & Poesio 2021). In addition, significant gains have been made on the task of word sense disambiguation by the use of PLMs (Loureiro & Jorge 2019; Loureiro et al. 2022), further reinforcing this notion.…”
Section: The Emergence Of Construction Grammar In Pre-trained Languag… (mentioning)
confidence: 95%
“…However, the results are not straightforward. Whereas some research found that polysemy detection, differentiation between polysemous words regarding NoS (Yenicelik et al., 2020), and even sense disambiguation are possible within transformer models (Soler & Apidianaki, 2021), some claim that the structures present in these pre-trained models are so seamless that information extraction is impossible (Wiedemann et al., 2019). The issue with these models is that the more advanced and better at disambiguation they are, the more technically complex they become.…”
Section: Describing Senses (mentioning)
confidence: 99%
“…These contextualized embeddings have been shown to improve performance on a number of downstream Natural Language Processing tasks involving lexical ambiguity, such as word sense disambiguation (Aina et al., 2019; Loureiro et al., 2020). Past work also suggests that BERT can be used to distinguish monosemous and polysemous words, or even polysemy and homonymy (Haber & Poesio, 2020a, 2020b; Nair et al., 2020; Soler & Apidianaki, 2021), and that BERT’s representations encode sense-like information (Karidi et al., 2021). Most relevantly for our purposes, BERT’s contextualized embeddings are well suited for measuring contextual distance in a graded manner: given two contextualized embeddings of an ambiguous target word (e.g., for “marinated lamb” and “friendly lamb”), we can compute the cosine distance between those vectors, a metric often used to assess proximity in vector space.…”
Section: Current Work (mentioning)
confidence: 99%
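
The contextual-distance measurement described in the citation statement above can be reproduced in a few lines. The sketch below is illustrative rather than code from any of the cited papers: it assumes the Hugging Face transformers library, the bert-base-uncased checkpoint, last-layer hidden states, and mean pooling over the target word’s subtokens; the example sentences are adapted from the quote.

```python
# Minimal sketch of graded contextual distance between two uses of an
# ambiguous word, using cosine distance over BERT's contextualized embeddings.
# Model name, layer choice, and pooling strategy are illustrative assumptions,
# not the configuration used in the cited work.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Return the mean last-layer vector of the target word's subtokens."""
    encoded = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**encoded).last_hidden_state[0]  # (seq_len, hidden_dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    tokens = encoded["input_ids"][0].tolist()
    # Locate the first occurrence of the target's subtoken span in the sentence.
    for i in range(len(tokens) - len(target_ids) + 1):
        if tokens[i : i + len(target_ids)] == target_ids:
            return hidden[i : i + len(target_ids)].mean(dim=0)
    raise ValueError(f"'{target}' not found in: {sentence}")

v1 = target_embedding("They served marinated lamb with rosemary.", "lamb")
v2 = target_embedding("The friendly lamb followed the children around.", "lamb")

# Cosine distance = 1 - cosine similarity; larger values = more distant uses.
distance = 1 - torch.nn.functional.cosine_similarity(v1, v2, dim=0).item()
print(f"Contextual (cosine) distance between the two uses of 'lamb': {distance:.3f}")
```

Note that the paper under discussion analyzes language-specific models and multilingual BERT while controlling for factors such as frequency, so the checkpoint and pooling choices above should be read as placeholders rather than the authors’ exact setup.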