Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-short.61

Zero-shot Fact Verification by Claim Generation

Abstract: Neural models for automated fact verification have achieved promising results thanks to the availability of large, human-annotated datasets. However, for each new domain that requires fact verification, creating a dataset by manually writing claims and linking them to their supporting evidence is expensive. We develop QACG, a framework for training a robust fact verification model by using automatically generated claims that can be supported, refuted, or unverifiable from evidence from Wikipedia. QACG generates…
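The three-way label space the abstract describes can be sketched as a simple data structure. This is a minimal sketch; the field names are illustrative, not the paper's actual data schema:

```python
# Minimal sketch of the labeled claim-evidence pairs that QACG-style
# generation produces. Field names are illustrative assumptions.
from dataclasses import dataclass

LABELS = ("SUPPORTED", "REFUTED", "NOT_ENOUGH_INFO")

@dataclass
class ClaimEvidencePair:
    evidence: str   # passage drawn from Wikipedia
    claim: str      # automatically generated claim
    label: str      # one of LABELS

pair = ClaimEvidencePair(
    evidence="Marie Curie won the Nobel Prize in Physics in 1903.",
    claim="Marie Curie won a Nobel Prize.",
    label="SUPPORTED",
)
assert pair.label in LABELS
```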

Cited by 28 publications (31 citation statements) | References 21 publications
“…Automatically generated texts can be used to generate claim-evidence pairs for each label category. To this end, Pan et al. (2021a) employ a two-step approach to generate synthetic data for fact verification. In the first step, question generation, a BART model fine-tuned on the SQuAD dataset with a similar input-output format takes the evidence and an answer and generates a question for that answer.…”
Section: Fact Verification (mentioning)
confidence: 99%
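The question-generation step this statement describes can be sketched with the Hugging Face transformers library. A minimal sketch follows; the checkpoint name and the answer-plus-evidence input format are assumptions for illustration, since the authors fine-tune BART on SQuAD rather than using it off the shelf:

```python
# Hedged sketch of answer-aware question generation with BART: the model
# reads the evidence passage plus a highlighted answer and decodes a
# question. "facebook/bart-base" is a stand-in checkpoint, and the
# "answer </s> evidence" input layout is an assumed format.
from transformers import BartForConditionalGeneration, BartTokenizer

MODEL = "facebook/bart-base"  # stand-in; the paper fine-tunes on SQuAD
tokenizer = BartTokenizer.from_pretrained(MODEL)
model = BartForConditionalGeneration.from_pretrained(MODEL)

evidence = "Marie Curie won the Nobel Prize in Physics in 1903."
answer = "1903"
# Concatenate answer and evidence, mirroring the SQuAD-style
# input-output format mentioned in the citation statement.
inputs = tokenizer(f"{answer} </s> {evidence}", return_tensors="pt")
question_ids = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(question_ids[0], skip_special_tokens=True))
```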
“…Recent work has considered using the information stored in the weights of a large pretrained language model, BERT (Devlin et al., 2019), as the only source of evidence, as it has been shown competitive in knowledge base completion (Petroni et al., 2019). However, without explicitly considering evidence, such approaches are likely to propagate biases learned from the data they were trained on, and render justification generation impossible (Lee et al., 2021; Pan et al., 2021).…”
Section: Evidence Retrieval and Claim Verification (mentioning)
confidence: 99%
“…Global Distractor: For each English riddle, we use the correct answer as the query and retrieve the top-2 most similar words or phrases from the pretrained Sense2Vec (Trask, Michalak, and Liu 2015; Pan et al. 2021). For each Chinese riddle, we use the word embeddings provided by Song et al. (2018).…”
Section: Distractor Generation (mentioning)
confidence: 99%
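The retrieval step described here can be sketched with the sense2vec Python package, whose keys are "phrase|POS" strings. A minimal sketch, assuming the standard pretrained Reddit vectors; the path is a placeholder:

```python
# Hedged sketch of global-distractor retrieval: query the pretrained
# Sense2Vec vectors with the correct answer and keep the top-2 most
# similar words or phrases as distractors. The disk path is a placeholder.
from sense2vec import Sense2Vec

s2v = Sense2Vec().from_disk("/path/to/s2v_reddit_2015_md")

answer_key = "horse|NOUN"  # correct answer used as the query
if answer_key in s2v:
    # Top-2 most similar entries, returned as (key, score) pairs.
    for key, score in s2v.most_similar(answer_key, n=2):
        print(key, score)
```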
“…Invalid Distractor Replacement: To avoid cases where a generated distractor is itself another correct answer or duplicates other distractors, we define rules to ensure that the distractor has minimal lexical overlap with the correct answer (Pan et al. 2021) and with other distractors of the same riddle. Specifically, two candidates are considered lexically overlapping if they share at least one word in English (e.g., “horse” and “wild horse”) or one character in Chinese (e.g., “蘑菇” (mushroom) and “平菇” (oyster mushroom)).…”
Section: Distractor Generation (mentioning)
confidence: 99%
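The overlap rules this statement describes amount to a small filter. A minimal sketch, assuming whitespace word tokenization for English and character-level comparison for Chinese:

```python
# Hedged sketch of the invalid-distractor filter: reject a candidate that
# shares any word (English) or any character (Chinese) with the correct
# answer or with an already-kept distractor. Tokenization choices are
# assumptions, not the authors' exact rules.
def overlaps(a: str, b: str, lang: str = "en") -> bool:
    if lang == "zh":
        # Chinese: character-level overlap, e.g. "蘑菇" vs "平菇" share "菇".
        return bool(set(a) & set(b))
    # English: word-level overlap, e.g. "horse" vs "wild horse".
    return bool(set(a.lower().split()) & set(b.lower().split()))

def is_valid(candidate: str, answer: str, kept: list[str], lang: str = "en") -> bool:
    # Keep the candidate only if it overlaps neither the answer nor any
    # previously accepted distractor for the same riddle.
    return not overlaps(candidate, answer, lang) and not any(
        overlaps(candidate, d, lang) for d in kept
    )

assert not is_valid("wild horse", "horse", [], "en")
assert not is_valid("平菇", "蘑菇", [], "zh")
```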