Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019), 2019
DOI: 10.18653/v1/s19-1002
Word Usage Similarity Estimation with Sentence Representations and Automatic Substitutes

Abstract: Usage similarity estimation addresses the semantic proximity of word instances in different contexts. We apply contextualized (ELMo and BERT) word and sentence embeddings to this task, and propose supervised models that leverage these representations for prediction. Our models are further assisted by lexical substitute annotations automatically assigned to word instances by context2vec, a neural model that relies on a bidirectional LSTM. We perform an extensive comparison of existing word and sentence representations …
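The abstract's core idea, scoring how similar two usages of the same word are from contextualized embeddings, can be illustrated with a minimal sketch. The snippet below is an assumption for illustration, not the authors' code: it extracts a target word's BERT vector from each sentence with the Hugging Face transformers library and scores the pair by cosine similarity; the model choice and the naive subtoken matching are simplifications.

```python
# Minimal sketch (not the paper's implementation): usage similarity
# as cosine between contextualized target-word embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def target_embedding(sentence: str, target: str) -> torch.Tensor:
    """Mean-pool the last-layer vectors of the target word's subtokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, dim)
    # Locate the target's subtoken span (naive id match; a real
    # implementation would use character offsets).
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError(f"'{target}' not found in sentence")

s1 = "She sat on the bank of the river."
s2 = "He deposited the check at the bank."
sim = torch.cosine_similarity(
    target_embedding(s1, "bank"), target_embedding(s2, "bank"), dim=0
)
print(f"usage similarity estimate: {sim.item():.3f}")
```

The paper's supervised models go beyond this unsupervised cosine baseline, training on gold usage similarity judgments and adding context2vec substitute annotations as features.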

Cited by 9 publications (9 citation statements) · References 19 publications
“…A static embedding baseline (FastText) is also provided. BERT and RoBERTa are reported as the best models without external resources in WiC (Pilehvar and Camacho-Collados, 2019) and Usim (Garí Soler et al., 2019); the previous best reported score is 0.693 (Neelakantan et al., 2014) for SCWS. (r: uncentered Pearson correlation, ρ: Spearman correlation, acc: accuracy)…”
mentioning
confidence: 95%
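The metrics named in this statement, uncentered Pearson r and Spearman ρ, are straightforward to compute; the sketch below (toy data assumed, purely for illustration) shows both. Uncentered Pearson is Pearson correlation without mean-centering, i.e. the cosine of the raw score vectors.

```python
# Sketch of the evaluation metrics named in the statement above:
# uncentered Pearson r and Spearman rho between predicted and gold scores.
import numpy as np
from scipy.stats import spearmanr

def uncentered_pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation without mean-centering (cosine of raw vectors)."""
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

gold = np.array([4.5, 1.2, 3.3, 5.0, 2.1])       # toy gold usage similarities
pred = np.array([0.91, 0.35, 0.70, 0.95, 0.42])  # toy model predictions

print("uncentered r:", uncentered_pearson(gold, pred))
print("rho:", spearmanr(gold, pred).correlation)
```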
“…We cluster each type of representation for w using k-means with a range of k values (2 ≤ k ≤ 10), and retain the k of the clustering with the highest mean SIL. Additionally, since BERT representations' cosine similarity correlates well with usage similarity (Garí Soler et al., 2019), we experiment with Agglomerative Clustering with average linkage directly on the cosine distance matrix obtained with BERT representations (BERT-AGG). For comparison, we also use Agglomerative Clustering on the gold usage similarity scores from Usim, transformed into distances (Gold-AGG).…”
Section: Sense Clustering
mentioning
confidence: 99%
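A rough sketch of the two clustering setups this statement describes, under assumed toy data and the scikit-learn API: k-means over k ∈ [2, 10] with silhouette-based (SIL) model selection, and BERT-AGG-style average-linkage agglomerative clustering on a precomputed cosine distance matrix.

```python
# Sketch (toy data, not the citing paper's code) of k-means with
# silhouette-based selection of k, and agglomerative clustering run
# directly on a cosine distance matrix of usage vectors (BERT-AGG style).
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import cosine_distances

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 768))  # stand-in for 40 BERT usage vectors

# k-means over a range of k; keep the k with the highest mean silhouette.
best_k, best_sil, best_labels = None, -1.0, None
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    sil = silhouette_score(X, labels)
    if sil > best_sil:
        best_k, best_sil, best_labels = k, sil, labels
print(f"k-means: best k={best_k} (mean SIL={best_sil:.3f})")

# BERT-AGG: agglomerative clustering, average linkage, cosine distances.
D = cosine_distances(X)
agg = AgglomerativeClustering(
    n_clusters=best_k, metric="precomputed", linkage="average"
)  # note: `metric=` is named `affinity=` in scikit-learn < 1.2
print("agglomerative labels:", agg.fit_predict(D)[:10])
```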
“…Pre-trained LMs have been shown to successfully leverage sense annotated data for disambiguation (Wiedemann et al., 2019; Reif et al., 2019). The interplay between word type and token-level information in the hidden representations of LSTM LMs has also been explored (Aina et al., 2019), as well as the similarity estimates that can be drawn from contextualized representations without directly addressing word meaning (Ethayarajh, 2019). In recent work, Vulić et al. (2020) probe BERT representations for lexical semantics, addressing out-of-context word similarity.…”
Section: Introduction
mentioning
confidence: 99%
“…The dashed red lines indicate 1.0 context (right) and 1.0 target word bias, indicating the datasets require the modeling of target words alone or context alone. and Jorge, 2019; Huang et al., 2019; Blevins and Zettlemoyer, 2020), WiC (Pilehvar and Camacho-Collados, 2019; Garí Soler et al., 2019) and entity linking (EL) (Wu et al., 2020; Broscheit, 2019).…”
Section: MCL-WiC
mentioning
confidence: 99%