Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.170
Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models

Abstract: With a growing number of BERTology works analyzing different components of pre-trained language models, we extend this line of research through an in-depth analysis of discourse information in pre-trained and fine-tuned language models. We move beyond prior work along three dimensions: First, we describe a novel approach to infer discourse structures from arbitrarily long documents. Second, we propose a new type of analysis to explore where and how accurately intrinsic discourse is captured in the BERT and BART…
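The abstract describes inferring discourse structures from pre-trained models. As a rough illustration of the kind of signal such an analysis starts from, and not the authors' actual algorithm, the sketch below extracts one head's self-attention from bert-base-uncased and aggregates it into a unit-by-unit (e.g. EDU-level) matrix. The model name, the chosen layer/head, and the token-to-unit mapping heuristic are all illustrative assumptions.

# Minimal sketch (not the paper's exact method): pull a token-level
# self-attention matrix from a pre-trained BERT and average it over
# text units as a starting point for attention-based tree inference.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

def unit_attention_matrix(units, layer=8, head=3):
    """Return a unit-by-unit attention matrix for a list of text units."""
    enc = tokenizer(" ".join(units), return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    # out.attentions: tuple over layers of (batch, heads, seq, seq) tensors
    att = out.attentions[layer][0, head]

    # Map tokens back to units by re-tokenizing each unit separately
    # (an illustrative heuristic; position 0 is [CLS], so start at 1).
    spans, start = [], 1
    for u in units:
        n = len(tokenizer.tokenize(u))
        spans.append((start, start + n))
        start += n

    m = torch.zeros(len(units), len(units))
    for i, (si, ei) in enumerate(spans):
        for j, (sj, ej) in enumerate(spans):
            m[i, j] = att[si:ei, sj:ej].mean()  # average token attention per unit pair
    return m

A matrix like this can then be turned into an unlabeled tree (e.g. by recursively splitting at the weakest inter-unit link), which is only one of several conceivable decoding choices.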

Cited by 2 publications (4 citation statements) · References 37 publications
“…We explore both vanilla and fine-tuned PLMs, as they were both shown to contain discourse information for monologues (Huber and Carenini, 2022).…”
Section: Which Kinds of PLMs to Use? (mentioning)
confidence: 99%
“…3.4 How to find the best heads? Xiao et al. (2021) and Huber and Carenini (2022) showed that discourse information is not evenly distributed between heads and layers. However, they do not provide a strategy to select the head(s) containing the most discourse information.…”
Section: How to Derive Trees from Attention Heads? (mentioning)
confidence: 99%
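The citation above notes that the prior works do not specify how to pick the head(s) carrying the most discourse information. One straightforward strategy, sketched below purely as an illustration and not taken from any of the cited papers, is to score every (layer, head) pair on a small development set by how closely its attention-derived tree matches a reference tree, then keep the best-scoring head. The attn_fn, tree_fn, and score_fn hooks are hypothetical placeholders supplied by the caller.

# Hedged sketch of one possible head-selection strategy (an assumption,
# not a procedure described in the cited papers).
def select_best_head(dev_docs, attn_fn, tree_fn, score_fn,
                     num_layers=12, num_heads=12):
    """attn_fn(units, layer, head) -> unit attention matrix,
    tree_fn(matrix) -> predicted tree,
    score_fn(pred_tree, gold_tree) -> similarity in [0, 1].
    dev_docs is a list of dicts with "units" and "gold_tree" keys."""
    best, best_score = None, -1.0
    for layer in range(num_layers):
        for head in range(num_heads):
            scores = [
                score_fn(tree_fn(attn_fn(doc["units"], layer, head)),
                         doc["gold_tree"])
                for doc in dev_docs
            ]
            avg = sum(scores) / len(scores)
            if avg > best_score:
                best, best_score = (layer, head), avg
    return best, best_score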
“…We extend the use of such methods to take into account structural elements. Although some studies have recently investigated how structural/discourse information is encoded in pre-trained language models (Wu et al., 2020; Huber and Carenini, 2022), to the best of our knowledge, we are the first to explore textual explainable methods not relying only on surface-form information. This is crucial for long texts, as methods such as LIME (Ribeiro et al., 2016) that rely on sampling word perturbations can become expensive for high token counts.…”
Section: Related Work (mentioning)
confidence: 99%