Proceedings of the 2nd Workshop on New Frontiers in Summarization 2019
DOI: 10.18653/v1/d19-5410
A Closer Look at Data Bias in Neural Extractive Summarization Models

Abstract: In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models. Specifically, we first propose several properties of datasets, which matter for the generalization of summarization models. Then we build the connection between priors residing in datasets and model designs, analyzing how different properties of datasets influence the choices of model structure design and trai…

Cited by 39 publications (18 citation statements, all classified as "mentioning") · References 27 publications · Citing publications span 2020–2022
“…Multi-domain Setting: Previous datasets are specified to one domain. However, the model trained on the summarization data of a single domain usually has poor generalization ability (Wang et al., 2019; Zhong et al., 2019b). Therefore, QMSum contains meetings across multiple domains: Product, Academic and Committee meetings.…”
Section: Number of Meetings and Summaries (mentioning)
confidence: 99%
“…Extractive Document Summarization: With the development of neural networks, great progress has been made in extractive document summarization. Most approaches adopt the encoder-decoder framework and use recurrent neural networks (Cheng and Lapata, 2016; Nallapati et al., 2017) or Transformer encoders (Zhong et al., 2019b; Wang et al., 2019a) for sentential encoding. Recently, pre-trained language models have also been applied in summarization to obtain contextual word representations (Zhong et al., 2019a; Liu and Lapata, 2019b; Zhong et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%
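The architecture this statement sketches (encode each sentence, contextualize the sentence vectors with a Transformer encoder, then score each sentence for inclusion in the summary) can be illustrated compactly. The following is a minimal PyTorch sketch of that generic setup, not any cited paper's actual model; all module names, pooling choices, and hyperparameters are illustrative assumptions.

# Hedged sketch of a Transformer-based extractive sentence scorer.
# Not a reproduction of any cited model; sizes are arbitrary.
import torch
import torch.nn as nn

class ExtractiveScorer(nn.Module):
    def __init__(self, vocab_size=30000, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=0)
        # Sentence-level Transformer: contextualizes sentence vectors
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.sent_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, 1)  # one extraction logit per sentence

    def forward(self, docs):
        # docs: (batch, n_sents, n_tokens) of token ids
        tok = self.embed(docs)                   # (b, s, t, d)
        sent_vecs = tok.mean(dim=2)              # mean-pool tokens -> (b, s, d)
        ctx = self.sent_encoder(sent_vecs)       # contextualized sentence reps
        return self.classifier(ctx).squeeze(-1)  # (b, s) extraction logits

# Training would use binary cross-entropy against oracle extraction labels;
# at test time the top-k scoring sentences form the summary.
model = ExtractiveScorer()
docs = torch.randint(1, 30000, (2, 10, 20))      # 2 docs, 10 sents, 20 tokens
logits = model(docs)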
“…Recent research on extractive summarization spans a wide range of approaches. These works usually instantiate an encoder-decoder architecture by choosing an RNN (Nallapati et al., 2017; Zhou et al., 2018), Transformer (Wang et al., 2019; Zhong et al., 2019b; Zhang et al., 2019b) or GNN (Jia et al., 2020b) as the encoder, with autoregressive (Jadhav and Rajan, 2018) or RL-based (Narayan et al., 2018; Arumae and Liu, 2018; Luo et al., 2019) decoders. For two-stage summarization, Chen and Bansal (2018) and Bae et al. (2019) follow a hybrid extract-then-rewrite architecture, using policy-based RL to bridge the two networks.…”
Section: Extractive Summarization (mentioning)
confidence: 99%
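To illustrate the autoregressive decoder family this statement mentions, here is a hedged PyTorch sketch of a pointer-style extractor: at each step it attends over contextualized sentence vectors and selects the next sentence to extract. Names and dimensions are assumptions for illustration, not a reproduction of any cited model.

# Hedged sketch of an autoregressive (pointer-style) sentence extractor.
import torch
import torch.nn as nn

class PointerExtractor(nn.Module):
    def __init__(self, d_model=256):
        super().__init__()
        self.cell = nn.LSTMCell(d_model, d_model)
        self.query = nn.Linear(d_model, d_model)

    def forward(self, sent_reps, n_steps=3):
        # sent_reps: (batch, n_sents, d) contextualized sentence vectors
        b, s, d = sent_reps.shape
        h = sent_reps.mean(dim=1)                # init decoder state from the doc
        c = torch.zeros_like(h)
        inp = torch.zeros(b, d)
        picks = []
        for _ in range(n_steps):
            h, c = self.cell(inp, (h, c))
            # dot-product attention over sentences -> pointer scores (b, s)
            scores = torch.bmm(sent_reps, self.query(h).unsqueeze(-1)).squeeze(-1)
            idx = scores.argmax(dim=-1)          # greedy pointer choice
            picks.append(idx)
            inp = sent_reps[torch.arange(b), idx]  # feed chosen sentence back
        return torch.stack(picks, dim=1)         # (b, n_steps) extracted indices

An RL-based variant of the same skeleton would sample from the pointer distribution instead of taking the argmax and reward sampled extractions with ROUGE, which is how the policy-based bridging in extract-then-rewrite systems is typically framed.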