Proceedings of the 25th Conference on Computational Natural Language Learning 2021
DOI: 10.18653/v1/2021.conll-1.29

Pragmatic competence of pre-trained language models through the lens of discourse connectives

Abstract: As pre-trained language models (LMs) continue to dominate NLP, it is increasingly important that we understand the depth of language capabilities in these models. In this paper, we target pre-trained LMs' competence in pragmatics, with a focus on pragmatics relating to discourse connectives. We formulate cloze-style tests using a combination of naturally-occurring data and controlled inputs drawn from psycholinguistics. We focus on testing models' ability to use pragmatic cues to predict discourse connectives,…
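
To make the probing setup concrete, here is a minimal sketch of such a cloze-style connective test using the HuggingFace transformers fill-mask pipeline. The model choice, example sentence, and candidate connectives are illustrative assumptions, not the paper's actual stimuli or evaluation protocol.

from transformers import pipeline

# Minimal cloze-style connective probe (illustrative only: the model,
# sentence, and candidate set are assumptions, not the paper's stimuli).
fill = pipeline("fill-mask", model="bert-base-uncased")

# A two-clause context whose pragmatics favors a contrastive connective.
text = "The restaurant was expensive. [MASK], the food was disappointing."

# Restrict scoring to a small set of discourse connectives.
candidates = ["however", "therefore", "moreover", "meanwhile"]

for pred in fill(text, targets=candidates):
    print(f"{pred['token_str']:>10}  p={pred['score']:.4f}")

A pragmatically competent model should place most of the probability mass on the contrastive connective in a context like this one.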

Cited by 21 publications (13 citation statements); references 17 publications.
“…by taking a closer look at the internal workings of the self-attention component. Looking at prior work analyzing the amount of discourse information in PLMs, structures are so far explored solely through proxy tasks, such as connective prediction (Pandia et al., 2021), relation classification (Kurfalı and Östling, 2021), and others (Koto et al., 2021a). However, despite the difficulties of encoding arbitrarily long documents, we believe that to systematically explore the relationship between PLMs and discourse, considering complete documents is imperative.…”
Section: Related Work
“…Besides their strong empirical results on most real-world problems, such as summarization (Zhang et al., 2020; Xiao et al., 2021a), question answering (Joshi et al., 2020; Oguz et al., 2021) and sentiment analysis (Adhikari et al., 2019), uncovering what kind of linguistic knowledge is captured by this new type of pre-trained language models (PLMs) has become a prominent question in itself. As part of this line of research, called BERTology (Rogers et al., 2020), researchers explore the amount of linguistic understanding encapsulated in PLMs, exposed through either external probing tasks (Raganato and Tiedemann, 2018; Zhu et al., 2020; Koto et al., 2021a) or unsupervised methods (Wu et al., 2020; Pandia et al., 2021). Previous work thereby focuses either on analyzing syntactic structures (e.g., Hewitt and Manning (2019); Wu et al. (2020)), relations (Papanikolaou et al., 2019), ontologies (Michael et al., 2020) or, to a more limited extent, discourse-related behaviour (Zhu et al., 2020; Koto et al., 2021a; Pandia et al., 2021).…”
Section: Introduction
“…To our knowledge, Pandia et al. (2021) is the only NLP study dealing with conjunction buttressing: the authors tested whether Transformer-based masked language models can predict the temporal connective corresponding to the correct interpretation of the enriched and, using the stimuli of Politzer-Ahles et al. (2017). Unlike their study, we created and used labeled data for the evaluation of NLI systems, testing a pragmatic hypothesis (enriched interpretation of and) vs. a logical one (commutative interpretation).…”
Section: Related Work
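
To illustrate the contrast this citing work describes, the sketch below scores one hypothetical premise-hypothesis pair with an off-the-shelf MNLI model. The sentences, model choice, and expected labels are stand-ins, not the cited study's actual labeled data.

from transformers import pipeline

# Hypothetical probe for the enriched ("and then") reading of "and";
# the sentence pair and model are illustrative, not the cited dataset.
nli = pipeline("text-classification", model="roberta-large-mnli")

premise = "Sue took the exam and passed it."
# Under plain logical conjunction the reversed conjunct order is
# equivalent; under the enriched temporal reading it is not.
hypothesis = "Sue passed the exam and then took it."

print(nli({"text": premise, "text_pair": hypothesis}))
# A purely logical (commutative) reader would predict entailment here;
# a pragmatically enriched reading points toward contradiction.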
“…A lot of attention has been paid to increasing LMs' general transparency (Ettinger, 2020; Rogers et al., 2020); among these studies, work on LMs' interpretation of implicitness mostly focuses on scalar implicature or presupposition (Schuster et al., 2020; Jeretic et al., 2020; Pandia et al., 2021). To our knowledge, no studies in this line have been done on gradable adjectives' EVAL implicature, although EVAL and gradability are classic topics in context sensitivity.…”
Section: Introduction