2021
DOI: 10.48550/arxiv.2104.13346
Preprint

Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics

Abstract: Modern summarization models generate highly fluent but often factually unreliable outputs. This has motivated a surge of metrics attempting to measure the factuality of automatically generated summaries. Due to the lack of common benchmarks, these metrics cannot be compared. Moreover, all these methods treat factuality as a binary concept and fail to provide deeper insights into the kinds of inconsistencies made by different systems. To address these limitations, we devise a typology of factual errors and use it to …

Cited by 7 publications (16 citation statements)
References 22 publications
“…In the work of Pagnoni et al. (2021), factual errors involving named entities and noun phrases account for the majority of all factual errors, and both align well with the cloze task in our scheme. As a result, we define named entities and noun phrases as factual factors in our approach.…”
Section: Factual Factors
confidence: 56%
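For illustration only (not code from the cited work): a minimal sketch of extracting such "factual factors", assuming spaCy with its small English model is available; the function name and example sentence are hypothetical.

# Extract named entities and noun phrases from a generated summary,
# i.e. the spans a cloze-style factuality check would mask and re-predict.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumption: the model is installed

def extract_factual_factors(summary: str) -> dict:
    doc = nlp(summary)
    return {
        "named_entities": [(ent.text, ent.label_) for ent in doc.ents],
        "noun_phrases": [chunk.text for chunk in doc.noun_chunks],
    }

print(extract_factual_factors(
    "The company reported a 20% revenue increase in Berlin last year."
))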
“…In contrast, Gabriel et al. (2020) and Pagnoni et al. (2021) proposed meta-evaluations of factual consistency, with more refined criteria for assessing the performance of factual consistency metrics.…”
Section: Related Work
confidence: 99%
“…A system capable of producing accurate summaries of the medical evidence on any given topic could dramatically improve the ability of caregivers to consult the whole of the evidence base to inform care. However, current neural summarization systems are prone to inserting inaccuracies into their outputs (Kryscinski et al., 2020; Maynez et al., 2020; Pagnoni et al., 2021; Ladhak et al., 2021; Choubey et al., 2021). This has been shown to be a particular problem in the context of medical literature summarization (Otmakhova et al., 2022), where there is a heightened need for factual accuracy.…”
Section: Introduction
confidence: 99%