Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.309

HoVer: A Dataset for Many-Hop Fact Extraction And Claim Verification

Abstract: We introduce HOVER (HOppy VERification), a dataset for many-hop evidence extraction and fact verification. It challenges models to extract facts from several Wikipedia articles that are relevant to a claim and classify whether the claim is SUPPORTED or NOT-SUPPORTED by the facts. In HOVER, the claims require evidence to be extracted from as many as four English Wikipedia articles and embody reasoning graphs of diverse shapes. Moreover, most of the 3/4-hop claims are written in multiple sentences, which adds to…

Cited by 62 publications (77 citation statements)
References 30 publications
“…After selecting salient words from the true claims for replacement, we need to provide only paraphrases that are opposite in meaning and consider the context in which these words occur. Language models have been used previously for infilling tasks (Donahue et al, 2020) and have also been used for automatic claim mutation in fact checking (Jiang et al, 2020). Inspired by these approaches, we use the Masked Language Model (MLM) RoBERTa (Liu et al, 2019) fine-tuned on CORD-19 (Wang et al, 2020) for infilling.…”
Section: Masked Language Model Infilling With Entailment-based Quality Control
confidence: 99%
“…Fact Checking and Verification. The need for claim verification has led to annotated fact checking datasets (Thorne et al, 2018; Baly et al, 2018; Augenstein et al, 2019; Jiang et al, 2020; Wadden et al, 2020). Recent works deploy adversarial attacks against fact checking systems (Thorne et al, 2019a,b; Niewinski et al, 2019; Atanasova et al, 2020b) and attempt to improve the system through generation (Atanasova et al, 2020a; Goyal and Durrett, 2020; Fan et al, 2020).…”
Section: Related Work
confidence: 99%
“…Augenstein et al (2019) collect claims on fact checking websites and release the MultiFC dataset. Jiang et al (2020) collect a dataset requiring many-hop evidence extraction from Wikipedia. Wadden et al (2020) collect a dataset of scientific claims to be verified.…”
Section: Introduction
confidence: 99%
“…Given only a single statement and a single sentence, this decision process is called recognizing textual entailment (Dagan et al, 2010, RTE) or natural language inference (Bowman et al, 2015; Williams et al, 2018, NLI). Given a single statement and a vast pool of possible evidence (e.g., all of Wikipedia), this problem is called verification (Thorne et al, 2018; Jiang et al, 2020). In stages 1 to 4, players write challenging claims either entailed or refuted by evidence from Wikipedia (Section 3.1).…”
Section: Introducing a Game of Challenging Claims
confidence: 99%