Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.207
BRIO: Bringing Order to Abstractive Summarization

Abstract: Abstractive summarization models are commonly trained using maximum likelihood estimation, which assumes a deterministic (one-point) target distribution in which an ideal model will assign all the probability mass to the reference summary. This assumption may lead to performance degradation during inference, where the model needs to compare several system-generated (candidate) summaries that have deviated from the reference summary. To address this problem, we propose a novel training paradigm which assumes a non-deterministic…
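To make the training paradigm described in the abstract more concrete, below is a minimal, illustrative Python sketch of a candidate-level margin ranking loss: candidates are scored by length-normalized log-probability, and candidates of higher quality (e.g. by ROUGE against the reference) are pushed above worse ones. The function names, the normalization exponent `alpha`, and the margin schedule are assumptions for illustration, not the paper's exact implementation.

```python
def length_normalized_score(token_logprobs, alpha=1.0):
    """Candidate score: sum of token log-probabilities, normalized by length**alpha."""
    return sum(token_logprobs) / (len(token_logprobs) ** alpha)

def candidate_ranking_loss(scores, margin=0.001):
    """Pairwise margin ranking loss over candidate scores that are already
    sorted from best to worst quality (e.g. by ROUGE against the reference).
    Better candidates should receive higher model scores; the margin grows
    with the rank gap between the two candidates in each pair."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss

# Toy usage: three candidates, already ordered from highest to lowest quality.
scores = [length_normalized_score(lp) for lp in (
    [-0.2, -0.1, -0.3],   # best candidate
    [-0.5, -0.4, -0.6],
    [-1.0, -0.9, -1.1],   # worst candidate
)]
print(candidate_ranking_loss(scores))
```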

Cited by 106 publications (66 citation statements) | References 34 publications
“…The benefit of two-phase summarization mainly lies in the decoding strategy, using beam search or stronger methods such as nucleus sampling [52], where diverse candidates create more opportunities to find an "ideal" one. In one of the most recent works, Liu et al. [53] introduced a new training paradigm that assigns probabilities to candidate summaries according to their quality via contrastive learning. Several other summarization works have since applied contrastive learning in different ways [54,55,56,57,58].…”
Section: Deep Learning Approaches For Text Summarization (mentioning)
confidence: 99%
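To illustrate the two-phase (generate-then-select) idea described in this statement, here is a minimal Python sketch. The `generate_candidates` and `score_candidate` callables are hypothetical stand-ins for a candidate generator (e.g. beam search or nucleus sampling over a base model) and a learned quality scorer, not any particular model or library.

```python
def two_phase_summarize(document, generate_candidates, score_candidate):
    """Phase 1: generate a diverse pool of candidate summaries;
    phase 2: return the candidate the scorer prefers."""
    candidates = generate_candidates(document)
    return max(candidates, key=score_candidate)

# Toy stand-ins so the sketch runs end to end.
doc = "a long news article ..."
fake_generate = lambda d: ["summary a", "summary bb", "summary ccc"]
fake_score = lambda s: len(s)  # stand-in for a learned quality estimator
print(two_phase_summarize(doc, fake_generate, fake_score))  # -> "summary ccc"
```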
“…Important sentences are often located at the beginning or end of documents (Baxendale, 1958; Marcu, 1998). This simple heuristic gives strong results on news summarization (Kedzie et al., 2018; Chen and Bansal, 2018; Narayan et al., 2018; Mao et al., 2021; Liu et al., 2022). We take one step further, jointly partitioning a document into multiple sections and estimating sentence salience given their proximity to section boundaries.…”
Section: Related Work (mentioning)
confidence: 99%
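As a rough illustration of the positional heuristic and the boundary-proximity idea mentioned above, here is a small Python sketch; the 1/(1+distance) scoring function and the example boundary positions are assumptions for illustration, not the cited papers' methods.

```python
def positional_salience(num_sentences, boundaries=(0,)):
    """Toy position-based salience: each sentence scores higher the closer it
    sits to the nearest boundary (document start, document end, or an assumed
    section break)."""
    return [1.0 / (1.0 + min(abs(i - b) for b in boundaries))
            for i in range(num_sentences)]

# A 10-sentence document with assumed section boundaries at sentences 0, 6 and 9.
print(positional_salience(10, boundaries=(0, 6, 9)))
```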
“…Existing second-stage summarization methods design a training objective to improve candidate selection among first-stage candidates, either by training a new model (Ravaut et al., 2022) or by re-using the first-stage model (Liu et al., 2022b). However, restricting selection to first-stage candidates may not be ideal, as they are bounded by the quality of the first-stage model.…”
Section: Ground Truth Summary (mentioning)
confidence: 99%
“…SummaReranker (Ravaut et al., 2022) and SimCLS train a RoBERTa model to re-rank candidates, the former with a multi-label binary cross-entropy loss, the latter with contrastive learning and a ranking loss. BRIO (Liu et al., 2022b) re-uses the base model for a second round of fine-tuning with both the cross-entropy loss and a candidate-level ranking loss. Existing fusion work in summarization focuses on sentence fusion.…”
Section: Related Work (mentioning)
confidence: 99%
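To sketch how a cross-entropy term and a candidate-level ranking term can be combined in a second round of fine-tuning, here is a minimal Python example; the function name, the token-level averaging, and the `gamma` weight are illustrative assumptions, not BRIO's exact objective.

```python
def second_round_objective(ref_token_logprobs, ranking_loss, gamma=1.0):
    """Sketch of a second-round fine-tuning objective: token-level cross-entropy
    on the reference summary plus a weighted candidate-level ranking loss
    (e.g. a pairwise margin loss as sketched earlier)."""
    cross_entropy = -sum(ref_token_logprobs) / len(ref_token_logprobs)
    return cross_entropy + gamma * ranking_loss

# Toy usage with made-up reference log-probabilities and a precomputed ranking loss.
print(second_round_objective([-0.1, -0.3, -0.2], ranking_loss=0.05, gamma=10.0))
```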