Proceedings of the 17th Workshop on Multiword Expressions (MWE 2021)
DOI: 10.18653/v1/2021.mwe-1.7

Finding BERT’s Idiomatic Key

Abstract: Sentence embeddings encode information relating to the usage of idioms in a sentence. This paper reports a set of experiments that combine a probing methodology with input masking to analyse where in a sentence this idiomatic information is taken from, and what form it takes. Our results indicate that BERT's idiomatic key is primarily found within an idiomatic expression, but also draws on information from the surrounding context. Also, BERT can distinguish between the disruption in a sentence caused by words …
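
The probing-plus-masking setup described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the authors' actual pipeline: the two labelled sentences, the single-[MASK] span replacement, and the logistic-regression probe are all simplifying assumptions; a real study would use a labelled idiom corpus with a held-out evaluation split.

```python
# A minimal sketch of probing with input masking, assuming bert-base-uncased
# and a hypothetical two-sentence toy dataset.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def sentence_embedding(sentence, mask_span=None):
    """Return the [CLS] vector; optionally hide a span behind [MASK] first."""
    if mask_span is not None:
        # Simplification: the whole multi-word span becomes one [MASK] token.
        sentence = sentence.replace(mask_span, tokenizer.mask_token)
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        output = model(**inputs)
    return output.last_hidden_state[0, 0].numpy()

# Hypothetical examples: 1 = idiomatic usage, 0 = literal usage.
data = [
    ("He finally kicked the bucket after a long illness.", "kicked the bucket", 1),
    ("She kicked the bucket over and spilled the water.", "kicked the bucket", 0),
]

# Probe once on full sentences and once with the expression masked out:
# accuracy that survives masking points to idiomatic signal in the context.
X_full = [sentence_embedding(s) for s, _, _ in data]
X_masked = [sentence_embedding(s, mask_span=span) for s, span, _ in data]
y = [label for _, _, label in data]

probe_full = LogisticRegression(max_iter=1000).fit(X_full, y)
probe_masked = LogisticRegression(max_iter=1000).fit(X_masked, y)
```

Comparing the two probes' held-out accuracies (on a real dataset) is the masking analysis in miniature: the accuracy drop caused by masking estimates how much of the idiomatic signal sits inside the expression itself.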

Cited by 10 publications (9 citation statements)
References 11 publications

“…Probing assumes that the accuracy of the classification model (i.e., a probe) on the task indicates whether the embeddings encode information relevant to the task target. There is a growing body of work using probing to examine what types of information are encoded in the embeddings created by Transformer models (Hewitt and Manning, 2019; Liu et al., 2019; Tenney et al., 2019; Nedumpozhimana and Kelleher, 2021), and also exploring what layer in the Transformer architecture different types of information are encoded in (Jawahar et al., 2019). In this work, we adapt the probing methodology to speech embeddings, and use it to understand and compare the phonetic information encoded in different layers of a Transformer model.…”
Section: Related Work
confidence: 99%
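
The layer-wise analysis this statement refers to (Jawahar et al., 2019) is also straightforward to sketch. The helper below is an illustrative assumption, not code from any of the cited papers: it fits one probe per hidden layer and reports cross-validated accuracy, so the layer profile of a given kind of information can be compared.

```python
# A sketch of layer-wise probing: one logistic-regression probe per BERT
# layer; `sentences` and `labels` are assumed placeholders for a real dataset.
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def layer_cls_vectors(sentence):
    """Return the [CLS] vector from every layer (embeddings + 12 blocks)."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs).hidden_states  # tuple of 13 tensors
    return [h[0, 0].numpy() for h in hidden_states]

def probe_per_layer(sentences, labels):
    """Mean cross-validated probe accuracy for each layer."""
    vectors = [layer_cls_vectors(s) for s in sentences]
    accuracies = []
    for layer in range(len(vectors[0])):
        X = [v[layer] for v in vectors]
        probe = LogisticRegression(max_iter=1000)
        accuracies.append(cross_val_score(probe, X, labels, cv=3).mean())
    return accuracies  # higher accuracy => more task signal at that layer
```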
“…These latter methods have more recently been superseded by approaches making use of distributional similarity in the form of both static and contextualized word embeddings (Gharbieh et al., 2016; Ehren, 2017; Senaldi et al., 2019; Hashempour and Villavicencio, 2020; Fakharian, 2021; Garcia et al., 2021; Nedumpozhimana and Kelleher, 2021), while keeping the underlying assumption unchanged: the vector representation of the component words should be distant from the vector representation of the context or of the expression as a whole.…”
Section: Related Work
confidence: 99%
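
The underlying assumption these citing papers describe, that the component words' vectors in an idiomatic use sit far from the vector of the surrounding context, can be sketched as follows. The wordpiece substring match and mean pooling here are illustrative assumptions, not the method of any specific cited paper.

```python
# A sketch of the distributional-similarity test: compare the contextualized
# vectors of an expression with those of its surrounding context.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def expression_context_similarity(sentence, expression):
    """Cosine similarity between the mean vector of the expression's tokens
    and the mean vector of all remaining (context) tokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        tokens = model(**enc).last_hidden_state[0]  # (seq_len, hidden)
    # Locate the expression's wordpiece span by simple substring match;
    # a real system would align spans more carefully.
    expr_ids = tokenizer(expression, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    start = next(i for i in range(len(ids)) if ids[i:i + len(expr_ids)] == expr_ids)
    expr_idx = list(range(start, start + len(expr_ids)))
    ctx_idx = [i for i in range(len(ids)) if i not in expr_idx]
    expr_vec = tokens[expr_idx].mean(dim=0)
    ctx_vec = tokens[ctx_idx].mean(dim=0)
    return torch.cosine_similarity(expr_vec, ctx_vec, dim=0).item()

# Under the assumption above, lower similarity suggests idiomatic usage.
print(expression_context_similarity("He finally kicked the bucket.", "kicked the bucket"))
```

A real detector would calibrate a decision threshold on labelled data rather than reading the raw score directly.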
“…Finally, these latter methods have been superseded by approaches making use of distributional similarity in the form of both static and contextualized word embeddings (Gharbieh et al., 2016; Ehren, 2017; Senaldi et al., 2019; Liu and Hwa, 2019; Hashempour and Villavicencio, 2020; Kurfalı and Östling, 2020; Fakharian, 2021; Garcia et al., 2021; Nedumpozhimana and Kelleher, 2021), while keeping the underlying assumption unchanged, that is, the vector representation of the component words should be distant from the vector representation of the context, or of the expression as a whole.…”
Section: Related Work
confidence: 99%