“…Modern NLP systems, however, are driven primarily by the notion of compositionality, which is at the core of several system components, including tokenization (Sennrich et al., 2016; Wu et al., 2016) and the self-attention mechanism (Vaswani et al., 2017). More fundamentally, recent studies (Zeng and Bhat, 2022) reveal that pre-trained language models (PTLMs), such as GPT-3 (Brown et al., 2020) and BART (Lewis et al., 2020), are ill-equipped to represent (and comprehend) the meanings of idiomatic expressions (IEs). This is demonstrated by the lack of correspondence between IE meanings and their embeddings: IEs with similar meanings are not close in the embedding space.…”
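The correspondence check described above can be sketched as a cosine-similarity comparison. The following is a minimal illustration only: the vectors are hypothetical toy embeddings standing in for real PTLM outputs, and the specific idioms and values are invented for exposition, not taken from the cited studies.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy embeddings (NOT from a real PTLM):
# two idioms with near-identical meanings, and one unrelated phrase.
emb_idiom_a   = np.array([0.90, 0.10, 0.20])  # e.g., "kick the bucket"
emb_idiom_b   = np.array([0.20, 0.90, 0.10])  # e.g., "bite the dust" (same meaning)
emb_unrelated = np.array([0.88, 0.12, 0.20])  # unrelated phrase

sim_same_meaning = cosine(emb_idiom_a, emb_idiom_b)
sim_unrelated    = cosine(emb_idiom_a, emb_unrelated)

# A meaning-faithful embedding space would yield
# sim_same_meaning > sim_unrelated; the cited studies report that
# PTLM embeddings often show the opposite pattern for IEs, as this
# contrived example mimics.
```

In practice, such a probe would replace the toy vectors with sentence embeddings from the model under study and aggregate the comparison over a large set of IE pairs annotated for meaning similarity.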