Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
DOI: 10.18653/v1/2020.findings-emnlp.43

Claim Check-Worthiness Detection as Positive Unlabelled Learning

Abstract: As the first step of automatic fact checking, claim check-worthiness detection is a critical component of fact checking systems. There are multiple lines of research which study this problem: check-worthiness ranking from political speeches and debates, rumour detection on Twitter, and citation needed detection from Wikipedia. To date, there has been no structured comparison of these various tasks to understand their relatedness, and no investigation into whether or not a unified approach to all of them is ach…

Cited by 24 publications (21 citation statements)
References 37 publications
“…Fact checking systems consist of components to identify check-worthy claims (Atanasova et al., 2018; Wright and Augenstein, 2020), retrieve and rank evidence documents (Yin and Roth, 2018; Allein et al., 2020), determine the relationship between claims and evidence documents (Bowman et al., 2015; Augenstein et al., 2016; Baly et al., 2018), and finally predict the claims' veracity (Thorne et al., 2018; Augenstein et al., 2019). As this is a relatively involved task, models easily overfit to shallow textual patterns, necessitating the need for adversarial examples to evaluate the limits of their performance.…”
Section: Fact Checking
Type: mentioning
confidence: 99%
“…We find that the best performance can be achieved by a Longformer-based model (Beltagy et al., 2020), which encodes entire paragraphs in papers and jointly predicts cite-worthiness labels for each of the sentences contained in the paragraph. Additional gains in recall can be achieved by using positive unlabelled learning, as documented in Wright and Augenstein (2020a) for the related task…”
Section: Methods For Cite-worthiness Detection
Type: mentioning
confidence: 99%
“…CITEWORTH consists of 1.2M sentences, balanced across 10 diverse scientific fields. While others have studied this task for few and/or narrow domains (Sugiyama et al., 2010; Färber et al., 2018), and have also studied very related tasks, such as claim check-worthiness detection (Wright and Augenstein, 2020a) or citation recommendation (Jürgens et al., 2018), this is the largest and most diverse dataset for this task to date.…”
Section: Cite-worthiness Detection
Type: mentioning
confidence: 99%
“…SciBERT + PU Learning: We experiment with SciBERT trained using positive-unlabelled (PU) learning (Elkan and Noto, 2008) which has been shown to significantly improve performance on citation needed detection in Wikipedia and rumour detection on Twitter (Wright and Augenstein, 2020a). The intuition behind PU learning is to assume that cite-worthy data is labelled and non-cite-worthy data is unlabelled, containing some cite-worthy examples.…”
Section: Manual Evaluation
Type: mentioning
confidence: 99%
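
The PU learning intuition quoted above can be illustrated with a small sketch of the Elkan and Noto (2008) correction. This is an illustration only, under loud assumptions: a toy one-dimensional feature stands in for sentence representations, a hand-rolled logistic regression stands in for SciBERT, and every function name here (`fit_logistic`, `estimate_label_frequency`, `positive_probability`, `run_demo`) is hypothetical. It is not necessarily the PU variant used in the citing or cited papers. The idea is to train a classifier to separate labelled from unlabelled examples, estimate the labelling frequency c = p(labelled | positive) on held-out labelled positives, and then rescale the classifier output by 1/c.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ss, lr=0.1, epochs=500):
    """Fit a 1-D logistic regression g(x) ~ p(s=1 | x) by batch gradient descent.

    s=1 means "labelled positive", s=0 means "unlabelled" (positive or negative).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, s in zip(xs, ss):
            err = sigmoid(w * x + b) - s
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def estimate_label_frequency(w, b, held_out_positives):
    """Elkan-Noto estimate of c = p(s=1 | y=1): mean g(x) over held-out labelled positives."""
    total = sum(sigmoid(w * x + b) for x in held_out_positives)
    return total / len(held_out_positives)


def positive_probability(w, b, c, x):
    """Corrected estimate p(y=1 | x) = g(x) / c, clipped to at most 1."""
    return min(1.0, sigmoid(w * x + b) / c)


def run_demo(seed=0):
    rng = random.Random(seed)
    # Toy data (an assumption): positives cluster near +2, negatives near -2.
    positives = [rng.gauss(2.0, 1.0) for _ in range(200)]
    negatives = [rng.gauss(-2.0, 1.0) for _ in range(200)]

    # Only half of the true positives carry a label; the rest are hidden
    # among the unlabelled examples, as in the PU setting.
    labelled = positives[:100]
    held_out, train_labelled = labelled[:20], labelled[20:]
    unlabelled = positives[100:] + negatives

    xs = train_labelled + unlabelled
    ss = [1] * len(train_labelled) + [0] * len(unlabelled)

    w, b = fit_logistic(xs, ss)
    c = estimate_label_frequency(w, b, held_out)
    return {
        "w": w,
        "c": c,
        "g_high": sigmoid(w * 3.0 + b),
        "p_high": positive_probability(w, b, c, 3.0),
        "p_low": positive_probability(w, b, c, -3.0),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

Because c is at most 1, the corrected score is never lower than the raw labelled-vs-unlabelled score, which is consistent with the recall gains mentioned in the statements above. The exact numbers depend on the toy data and should not be read as results.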