Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.301
Discourse Probing of Pretrained Language Models

Abstract: Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse, but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the differe…
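The probing methodology referenced in the abstract follows a standard recipe: the pretrained LM is kept frozen, hidden representations are read off from a chosen layer, and only a lightweight classifier is trained on top, so probe accuracy reflects what the representation already encodes rather than what fine-tuning could inject. Below is a minimal Python sketch of that recipe; the model name, layer index, mean pooling, and the toy "does sentence B follow sentence A?" task are illustrative assumptions, not details taken from the paper.

```python
# Minimal probing sketch (not the paper's exact setup): freeze a pretrained
# encoder, extract one layer's representations, and fit a small classifier.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

# Hypothetical toy discourse task: binary "does B follow A?" sentence pairs.
pairs = [
    ("The committee met on Monday.", "It approved the new budget.", 1),
    ("The committee met on Monday.", "Photosynthesis occurs in chloroplasts.", 0),
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()  # the LM stays frozen; only the probe below is trained


def embed(a, b, layer=8):
    # Mean-pool the chosen hidden layer over all tokens of the sentence pair.
    inputs = tokenizer(a, b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()


X = [embed(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
probe = LogisticRegression(max_iter=1000).fit(X, y)
```

Sweeping the layer index and swapping in different pretrained encoders yields the per-layer, per-model comparison the abstract summarizes.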

Cited by 23 publications (31 citation statements); references 23 publications.
“…For instance, regarding the adaptability evaluation for Spanish models, Cañete et al (2020) recently proposed GLUES, a Spanish version of GLUE. In the case of representation evaluation, most of the work is in a cross-linguistic setting for word (Şahin et al, 2020), sentence (Ravishankar et al, 2019) and discourse (Koto et al, 2021) evaluations. For this reason, and following the motivation of works such as RuSentEval (Mikhailov et al, 2021), we provide SentEval and DiscoEval in Spanish, which consist of tasks originally created with texts in Spanish and aimed at evaluating models of that language.…”
Section: Language Model Evaluations
confidence: 99%
“…At the base of our work are two of the most popular and frequently used PLMs: BERT (Devlin et al, 2018) and BART (Lewis et al, 2020). We choose these two popular approaches in our study due to their complementary nature (encoder-only vs. encoder-decoder) and based on previous work by Zhu et al (2020) and Koto et al (2021a), showing the effectiveness of BERT and BART models for discourse related tasks.…”
Section: Related Work
confidence: 99%
“…Besides their strong empirical results on most real-world problems, such as summarization (Zhang et al, 2020; Xiao et al, 2021a), question answering (Joshi et al, 2020; Oguz et al, 2021) and sentiment analysis (Adhikari et al, 2019), uncovering what kind of linguistic knowledge is captured by this new type of pre-trained language models (PLMs) has become a prominent question by itself. As part of this line of research, called BERTology (Rogers et al, 2020), researchers explore the amount of linguistic understanding encapsulated in PLMs, exposed through either external probing tasks (Raganato and Tiedemann, 2018; Zhu et al, 2020; Koto et al, 2021a) or unsupervised methods (Wu et al, 2020; Pandia et al, 2021) to analyze syntactic structures (e.g., Hewitt and Manning (2019); Wu et al (2020)), relations (Papanikolaou et al, 2019), ontologies (Michael et al, 2020) and, to a more limited extent, discourse-related behaviour (Zhu et al, 2020; Koto et al, 2021a; Pandia et al, 2021).…”
Section: Introduction
confidence: 99%
“…Following this line, many subsequent papers queried the amount of knowledge encoded in various parts of neural models. These included syntax-related (Lakretz et al, 2019; Hewitt and Manning, 2019), semantic-related (Tenney et al, 2019), and discourse-related information (Koto et al, 2021). Towards developing reliable probing methods, several papers proposed control mechanisms (Pimentel et al, 2020).…”
Section: Related Work
confidence: 99%