Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.365
Analysing Lexical Semantic Change with Contextualised Word Representations

Abstract: This paper presents the first unsupervised approach to lexical semantic change that makes use of contextualised word representations. We propose a novel method that exploits the BERT neural language model to obtain representations of word usages, clusters these representations into usage types, and measures change along time with three proposed metrics. We create a new evaluation dataset and show that the model representations and the detected semantic shifts are positively correlated with human judgements. Ou…

Cited by 116 publications (112 citation statements: 1 supporting, 111 mentioning, 0 contrasting)
References 39 publications
“…This provides new opportunities for diachronic analysis: for example, it is possible to group similar token representations and measure the diversity of such representations, without a predefined number of senses being strictly necessary. Thus, there is currently increased interest in the topic of language change detection using contextualized word embeddings [9,10,14,21,27,28].…”
Section: Contextualized Word Embeddings (mentioning)
confidence: 99%
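As a concrete illustration of the grouping-and-diversity idea in the statement above, the following sketch extracts contextualised vectors for a target word with a pre-trained BERT model and scores usage diversity as the mean pairwise cosine distance. This is a minimal reconstruction under my own assumptions (the HuggingFace transformers and PyTorch libraries, exact-token matching that skips WordPiece-split occurrences, and the function names), not the cited papers' implementation.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def usage_vectors(sentences, target):
    """Collect one contextualised vector per occurrence of `target`."""
    vectors = []
    for sent in sentences:
        enc = tokenizer(sent, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
        tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
        for i, tok in enumerate(tokens):
            if tok == target:  # simplification: skips WordPiece-split occurrences
                vectors.append(hidden[i])
    return torch.stack(vectors)

def usage_diversity(vectors):
    """Mean pairwise cosine distance across all usages of a word."""
    v = torch.nn.functional.normalize(vectors, dim=1)
    sim = v @ v.T                     # cosine similarities; diagonal is 1
    n = v.shape[0]
    return (1.0 - sim).sum().item() / (n * (n - 1))  # mean over off-diagonal pairs
```

In this sketch, a word whose occurrences split into several distinct usage types yields a higher diversity score than a monosemous word, with no sense inventory fixed in advance.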
“…Evolution of each word was measured by comparing distributions of senses in different time slices. In [9] and [10], a pre-trained BERT model was used to obtain representations of word usages in an unsupervised fashion, without a predefined list or number of senses. Representations of similar usages were then clustered using the k-means algorithm, and the distributions of a word's usages across these clusters were used in two metrics for quantifying the degree of semantic change: entropy difference and Jensen-Shannon divergence.…”
Section: Contextualized Word Embeddings (mentioning)
confidence: 99%
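The two change metrics named in the statement above are straightforward to compute once usage vectors are clustered. The sketch below is an illustrative reconstruction rather than the authors' code: it assumes a (n_usages, dim) NumPy array of usage vectors with one period label per usage, clusters them with scikit-learn's k-means, and compares the per-period cluster distributions via entropy difference and Jensen-Shannon divergence.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy
from sklearn.cluster import KMeans

def usage_distributions(vectors, periods, k=5, seed=0):
    """Cluster all usages jointly; return one cluster-frequency distribution per period.

    vectors: (n_usages, dim) NumPy array (e.g. torch_vectors.numpy());
    periods: one time-period label per usage.
    """
    labels = KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(vectors)
    periods = np.asarray(periods)
    dists = {}
    for p in np.unique(periods):
        counts = np.bincount(labels[periods == p], minlength=k)
        dists[p] = counts / counts.sum()
    return dists

def entropy_difference(p, q):
    """Change in entropy of the usage distribution between two periods."""
    return entropy(q) - entropy(p)

def jensen_shannon_divergence(p, q):
    """JSD between the two periods' usage distributions."""
    return jensenshannon(p, q) ** 2  # SciPy returns the distance, i.e. sqrt(JSD)
```

For example, jensen_shannon_divergence(dists[1960], dists[1990]) is close to 0 when a word's usage distribution is stable across the two periods and grows toward log 2 as the distributions diverge.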
“…This criterion can be viewed as a form of linguistic generalization, and, if satisfied, enables downstream models to produce consistent results across related words. To test this criterion, we compute the vector similarities between the contextualized representations of complex words and their components, a method that coheres with human judgments of contextual semantic similarity (Giulianelli et al., 2020). We probe BERT with synthetic inputs constructed by replacing each complex word with its space-delimited bases.…”
Section: Blends in Context (mentioning)
confidence: 99%
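The probing setup in this last statement can likewise be sketched in a few lines: compare the contextualised representation of a complex word with that of its space-delimited bases substituted into the same context. The example below is hypothetical; the chosen blend, the mean-pooling over subword pieces, and the helper function are my assumptions, not details taken from the citing paper.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def span_vector(sentence, span):
    """Mean-pooled contextualised vector of `span` inside `sentence`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
    span_ids = tokenizer(span, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the span's subword ids inside the sentence's subword ids.
    for i in range(len(ids) - len(span_ids) + 1):
        if ids[i:i + len(span_ids)] == span_ids:
            return hidden[i:i + len(span_ids)].mean(dim=0)
    raise ValueError(f"{span!r} not found in {sentence!r}")

# Hypothetical example: the blend "brunch" vs. its bases "breakfast lunch".
orig = span_vector("We met for brunch on Sunday.", "brunch")
bases = span_vector("We met for breakfast lunch on Sunday.", "breakfast lunch")
similarity = torch.nn.functional.cosine_similarity(orig, bases, dim=0)
print(f"cosine similarity: {similarity.item():.3f}")
```

A high cosine similarity between the two vectors indicates that BERT treats the complex word and its bases as semantically close in context, which is the criterion being tested.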