2021
DOI: 10.1007/978-3-030-72240-1_4

Reliability Prediction for Health-Related Content: A Replicability Study

Abstract: Determining reliability of online data is a challenge that has recently received increasing attention. In particular, unreliable health-related content has become pervasive during the COVID-19 pandemic. Previous research [37] has approached this problem with standard classification technology using a set of features that have included linguistic and external variables, among others. In this work, we aim to replicate parts of the study conducted by Sondhi and his colleagues using our own code, and make it availa…

Cited by 8 publications (7 citation statements)
References 36 publications
“…Normalized count of commercial terms: As illustrated in the literature [13], the higher the number of commercial terms, the less credible is perceived the related information, due to the for-profit purpose of such information. At a practical level, a list of 45 commercial terms taken from [72] (such as “sale”, “deal”, “ad”, etc.) has been compiled.…”
Section: Materials and Methods (mentioning)
confidence: 99%
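The feature described in this statement can be illustrated with a minimal sketch. It is not the cited authors' implementation: the term list below is a hypothetical three-word stand-in for the 45-term list (which is not reproduced on this page), and normalizing by the total token count is an assumption, since the statement does not say what the count is normalized by.

```python
import re

# Hypothetical, abbreviated term list. The cited work compiles 45 terms;
# only the three examples quoted in the statement are used here.
COMMERCIAL_TERMS = {"sale", "deal", "ad"}


def normalized_commercial_count(text, terms=COMMERCIAL_TERMS):
    """Return the share of tokens in `text` that are commercial terms.

    Assumption: the count is normalized by the number of word tokens.
    """
    # Lowercase the text and keep alphabetic runs as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in terms)
    return hits / len(tokens)
```

Under these assumptions, a higher value would indicate more commercial vocabulary and, following the statement, lower perceived credibility.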
“…Two other recent works based on the use of handcrafted features and Machine Learning approaches are those described in [15,25]. In [25], a Logistic Regression model for assessing the reliability of Web pages has been trained on labeled data collected w.r.t.…”
Section: Automated Approaches (mentioning)
confidence: 99%
“…Textual features are employed in the form of count-based and TF-IDF word vectors. In [15], a replicability study has been conducted on [36], considering two additional datasets made available in [34,39], and ignoring PageRank features, deemed as not suitable for assessing Web content reliability [30].…”
Section: Automated Approaches (mentioning)
confidence: 99%
“…Technological innovation in the fight against disinformation, as the authors argue, should go beyond discrediting noncredible sources of information and should instead promote more careful information consumption [11]. The literature has reported on successful machine learning models that classify entire articles or information sources [12, 13]. Of note, these models can easily overfit (ie, obtain high classification accuracy for publications from media outlets present in the training set but fail to generalize to previously unseen media outlets).…”
Section: Introduction (mentioning)
confidence: 99%
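The overfitting concern in the last statement is usually checked by holding out whole media outlets during evaluation. The sketch below is one possible way to build such a split, not a procedure taken from any of the cited papers; the `(outlet, text, label)` record layout, the function name, and the default parameters are all assumptions made for illustration.

```python
import random


def split_by_outlet(records, test_fraction=0.25, seed=0):
    """Split (outlet, text, label) records so no outlet appears on both sides.

    Hypothetical helper: the record layout and defaults are assumptions.
    """
    # Collect the distinct outlets and shuffle them reproducibly.
    outlets = sorted({outlet for outlet, _, _ in records})
    rng = random.Random(seed)
    rng.shuffle(outlets)

    # Reserve at least one outlet for the test side.
    n_test = max(1, round(len(outlets) * test_fraction))
    test_outlets = set(outlets[:n_test])

    train = [r for r in records if r[0] not in test_outlets]
    test = [r for r in records if r[0] in test_outlets]
    return train, test
```

A model scored on the `test` side of such a split only sees outlets that were absent from training, so a large drop relative to a random split would point to the generalization failure the statement describes.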