Findings of the Association for Computational Linguistics: EMNLP 2021
DOI: 10.18653/v1/2021.findings-emnlp.294
AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models

Abstract: Despite their success in a variety of NLP tasks, pre-trained language models, due to their heavy reliance on compositionality, fail in effectively capturing the meanings of multiword expressions (MWEs), especially idioms. Therefore, datasets and methods to improve the representation of MWEs are urgently needed. Existing datasets are limited to providing the degree of idiomaticity of expressions along with the literal and, where applicable, (a single) non-literal interpretation of MWEs. This work presents a nov…


Cited by 23 publications (51 citation statements)
References 22 publications
“…Table 3 presents the results of our best PET-based models alongside our BERTRAM-based model on the test set, as well as the mBERT system presented in (Tayyar Madabushi et al, 2022), for comparison. For each model we present the F1 macro score on the test set for each language, as well as the overall F1 macro score.…”
Section: Results
confidence: 99%
“…In evaluating the models presented in this work we use Task 2 of SemEval 2022: Multilingual Idiomaticity Detection and Sentence Embedding (Tayyar Madabushi et al, 2022).…”
Section: Dataset and Task Description
confidence: 99%
“…We use the training and development splits from Tayyar Madabushi et al (2021) with the addition of Galician data, and use the test split released by them as the evaluation split during the initial practice phase of the competition. We create an independent test set consisting of examples with new MWEs, and this set was used to determine the teams' final rankings.…”
Section: The Competition Dataset
confidence: 99%
“…Understanding the semantic meaning of a sentence requires the correct identification of the MWE in the sentence. Table 1 contains one case for each of the two usages (Tayyar Madabushi et al, 2021).…”
Section: Introduction
confidence: 99%