Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.22
How Decoding Strategies Affect the Verifiability of Generated Text

Abstract: Recent progress in pre-trained language models led to systems that are able to generate text of an increasingly high quality. While several works have investigated the fluency and grammatical correctness of such models, it is still unclear to which extent the generated text is consistent with factual world knowledge. Here, we go beyond fluency and also investigate the verifiability of text generated by state-of-the-art pre-trained language models. A generated sentence is verifiable if it can be corroborated or …

Cited by 32 publications (28 citation statements)
References 22 publications
“…While our methods have achieved higher human ratings of engagingness and humanness, our models still have numerous issues. Firstly, even our best models still make mistakes: they i) contradict or repeat themselves on occasion, ii) tend to repeat the same phrases in separate conversations, and iii) hallucinate knowledge, as seen in other generative systems (Massarelli et al., 2019). Each of these faults naturally leads to future research directions; we made some attempt here using unlikelihood (Li et al., 2019a) and conditioning on knowledge (Dinan et al., 2019c), but more needs to be done.…”
Section: Discussion (mentioning)
confidence: 99%
“…, x_{t−1}) ≥ p. Thus, the number of candidate tokens considered varies dynamically depending on the context, and the resulting text is reasonably natural with fewer repetitions. Recently, Massarelli et al. (2020) show that top-k and top-p samplers tend to generate more non-factual sentences, as measured against Wikipedia.…”
Section: Generating Text From Tgmmentioning
confidence: 72%
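For context, the quoted passage describes nucleus (top-p) sampling: at each step, sampling is restricted to the smallest set of tokens whose cumulative probability reaches p. Below is a minimal NumPy sketch of that decoding step; the function name top_p_sample, the toy logits, and the implementation details are illustrative assumptions, not code from the paper or the citing work.

```python
import numpy as np

def top_p_sample(logits, p=0.9, rng=None):
    """Draw one token id via nucleus (top-p) sampling: sample from the
    smallest set of tokens whose cumulative probability is at least p."""
    rng = rng or np.random.default_rng()
    # Softmax over next-token logits (shift by the max for numerical stability).
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    # Sort tokens by probability, highest first, and accumulate their mass.
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    # Keep the minimal prefix whose mass reaches p; its size changes with the
    # context, which is the dynamically varying candidate set the quote describes.
    cutoff = int(np.searchsorted(cumulative, p)) + 1
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()  # renormalize
    return int(rng.choice(nucleus, p=nucleus_probs))

# Example: sample from a toy 5-token vocabulary.
token_id = top_p_sample(np.array([2.0, 1.0, 0.5, -1.0, -3.0]), p=0.9)
```

Unlike top-k sampling, which always keeps a fixed number of candidates, the nucleus shrinks when the model is confident and grows when it is uncertain; the finding cited above is that both truncation-based samplers trade verifiability for this diversity.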
“…Prior work concerning evaluation of automatic metrics and human evaluation for NLG systems has mainly focused on general analysis of output quality or coherence and fluency (Callison-Burch et al., 2007; Graham, 2015; Fabbri et al., 2021), rather than factuality. Recent efforts by NLP researchers have drawn attention to the issue of factual errors and hallucinations in the output of neural summarization models (Cao et al., 2018; Massarelli et al., 2019; Zhao et al., 2020; Falke et al., 2019b; Goodrich et al., 2019; Celikyilmaz et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%