2022
DOI: 10.1162/tacl_a_00453
SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

Abstract: In the summarization domain, a key requirement for summaries is to be factually consistent with the input document. Previous work has found that natural language inference (NLI) models do not perform competitively when applied to inconsistency detection. In this work, we revisit the use of NLI for inconsistency detection, finding that past work suffered from a mismatch in input granularity between NLI datasets (sentence-level) and inconsistency detection (document-level). We provide a highly effective and lig…

Cited by 93 publications (148 citation statements)
References 34 publications
“…Metric comparisons, however, were often conducted on isolated datasets. Laban et al. (2021) unify work in entailment-based metrics for factual consistency, showing the effect of granularity, base models, and other hyperparameter choices. This work also proposes a learned metric built on top of the output of an entailment model, with parameters fine-tuned on synthetic data.…”
Section: Related Work
Confidence: 88%
“…For example, some work reports entailment-based metrics as performing best (Koto et al., 2020; Maynez et al., 2020), while other work argues for QA metrics (Durmus et al., 2020; Wang et al., 2020b; Scialom et al., 2021). Recently, Laban et al. (2021) proposed a benchmark called SummaC to compare metrics across six factual consistency datasets for the task of binary factual consistency classification: whether a summary is entirely factually consistent or not. This work unifies prior work on entailment-based metrics by studying the effect of input granularity, pretrained entailment model, and other hyperparameter choices on downstream evaluation performance.…”
Section: Entailment Matrix
Confidence: 99%