Vera: Prediction Techniques for Reducing Harmful Misinformation in Consumer Health Search

Pradeep, Ronak; Ma, Xueguang; Nogueira, Rodrigo; Lin, Jimmy

doi:10.1145/3404835.3463120

Cited by 18 publications

(19 citation statements)

References 16 publications


“…By construction, the veracity of each claim is determined by the (candidate) supporting sentences, taken together. One simple and popular approach to fact extraction and verification is to consider the veracity of the claim with respect to each candidate independently (i.e., classification), and then aggregate the evidence (Hanselowski et al., 2018; Zhou et al., 2019; Soleimani et al., 2019; Pradeep et al., 2021b). For convenience, we refer to these as "pointwise approaches", borrowing from the learning to rank literature (Li, 2011).…”

Section: Background and Related Work | mentioning

confidence: 99%
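
The "pointwise" idea quoted above (score the claim against each candidate sentence independently, then aggregate) can be illustrated with a minimal sketch. The aggregation rule here, a plain mean of per-sentence label probabilities, is an assumption chosen for illustration; the quoted passage does not say which rule the cited systems use. The function name and the example probabilities are hypothetical.

```python
# Minimal sketch of pointwise evidence aggregation.
# Assumption: each candidate sentence has already been classified on its own,
# giving one probability distribution over the three FEVER labels.
LABELS = ("SUPPORTS", "REFUTES", "NOT ENOUGH INFO")


def aggregate_pointwise(per_sentence_probs):
    """Average per-sentence label distributions and return (label, averaged).

    Mean aggregation is an illustrative choice, not the method of any
    particular cited system.
    """
    if not per_sentence_probs:
        raise ValueError("need at least one candidate sentence")
    totals = {label: 0.0 for label in LABELS}
    for probs in per_sentence_probs:
        for label in LABELS:
            totals[label] += probs[label]
    n = len(per_sentence_probs)
    averaged = {label: total / n for label, total in totals.items()}
    best = max(LABELS, key=lambda label: averaged[label])
    return best, averaged


# Hypothetical classifier outputs for two candidate sentences.
example = [
    {"SUPPORTS": 0.8, "REFUTES": 0.1, "NOT ENOUGH INFO": 0.1},
    {"SUPPORTS": 0.6, "REFUTES": 0.3, "NOT ENOUGH INFO": 0.1},
]
label, averaged = aggregate_pointwise(example)
```

Because each sentence is scored in isolation, a rule like this cannot combine two sentences that only jointly support a claim, which is the limitation the listwise approach in the citing paper is meant to address.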

Exploring Listwise Evidence Reasoning with T5 for Fact Verification

Jiang, Pradeep, Lin

2021

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Confer

Self Cite


This work explores a framework for fact verification that leverages pretrained sequence-to-sequence transformer models for sentence selection and label prediction, two key sub-tasks in fact verification. Most notably, improving on previous pointwise aggregation approaches for label prediction, we take advantage of T5 using a listwise approach coupled with data augmentation. With this enhancement, we observe that our label prediction stage is more robust to noise and capable of verifying complex claims by jointly reasoning over multiple pieces of evidence. Experimental results on the FEVER task show that our system attains a FEVER score of 75.87% on the blind test set. This puts our approach atop the competitive FEVER leaderboard at the time of our work, scoring higher than the second place submission by almost two points in label accuracy and over one point in FEVER score.



“…Inspired by the Vera system from Pradeep et al. [20], we fine-tune the pre-trained T5 language model [21] to detect stances. We formulate this task as a binary classification task: given the health topic and a relevant document, the model aims to detect the document's stance towards the treatment of the health issue, i.e., whether or not the document supports the use of the treatment.…”

Section: Stance Detection Model (SDM) | mentioning

confidence: 99%

“…To obtain binary classification scores, we use an approach similar to Pradeep et al. [20]. Specifically, we apply a softmax function on the logits of the "favor" and "against" found in T5's first generated token.…”

Section: Stance Detection Model (SDM) | mentioning

confidence: 99%
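
The scoring step described in the statement above, a softmax restricted to the "favor" and "against" logits, can be sketched as follows. This is a minimal illustration under stated assumptions: the two logit values are made up, the function name is hypothetical, and extracting the logits from T5's first generated token is not shown.

```python
import math


def two_way_softmax(favor_logit, against_logit):
    """Softmax over exactly two logits, returning (p_favor, p_against).

    Subtracting the larger logit first keeps exp() from overflowing;
    it does not change the result.
    """
    m = max(favor_logit, against_logit)
    e_favor = math.exp(favor_logit - m)
    e_against = math.exp(against_logit - m)
    z = e_favor + e_against
    return e_favor / z, e_against / z


# Hypothetical logits for the "favor" and "against" tokens.
p_favor, p_against = two_way_softmax(2.0, 0.0)
```

Restricting the softmax to two tokens discards probability mass on the rest of the vocabulary, so the two scores always sum to one and `p_favor` can be read directly as a binary classification score.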

“…If people make incorrect decisions with regard to their health queries, these decisions may have a serious negative impact on their lives. Approaches to reducing the rate at which people make incorrect decisions include changes to the search process [14], alerting users to bias in results [10], providing answers directly [7,12] and the ranking of search results [20]. In this paper, we focus on the latter approach, i.e., ranking correct information before incorrect information.…”

Section: Introduction | mentioning

confidence: 99%

“…When a ranker is given the correct answer to a TREC Health Misinformation search topic, Pradeep et al. [20] have shown that using T5 sequence-to-sequence models to determine documents' stances and to rerank documents can produce superior results. Unfortunately, Pradeep et al.'s method lacks a way to automatically determine the correct answer, which limits the approach.…”

Section: Introduction | mentioning

confidence: 99%


Learning Trustworthy Web Sources to Derive Correct Answers and Reduce Health Misinformation in Search

Zhang, Tahami, Abualsaud, et al.

2022

Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval


When searching the web for answers to health questions, people can make incorrect decisions that have a negative effect on their lives if the search results contain misinformation. To reduce health misinformation in search results, we need to be able to detect documents with correct answers and promote them over documents containing misinformation. Determining the correct answer has been a difficult hurdle to overcome for participants in the TREC Health Misinformation Track. In the 2021 track, automatic runs were not allowed to use the known answer to a topic's health question, and as a result, the top automatic run had a compatibility-difference score of 0.043 while the top manual run, which used the known answer, had a score of 0.259. The compatibility-difference measures the ability of methods to rank correct and credible documents before incorrect and non-credible documents. By using an existing set of health questions and their known answers, we show it is possible to learn which web hosts are trustworthy, from which we can predict the correct answers to the 2021 health questions with an accuracy of 76%. Using our predicted answers, we can promote documents that we predict contain this answer and achieve a compatibility-difference score of 0.129, which is a three-fold increase in performance over the best previous automatic method.

CCS CONCEPTS: Information systems → Retrieval models and ranking.
