In the Social Web scenario, where large amounts of User Generated Content diffuse through Social Media, the risk of running into misinformation is not negligible. For this reason, assessing and mining the credibility of both information sources and the information itself is now a fundamental issue. Credibility, also referred to as believability, is a quality perceived by individuals, who are not always able to distinguish genuine information from fake information using their cognitive capacities alone. Consequently, in recent years several approaches have been proposed to automatically assess credibility in Social Media. Most of them are based on data‐driven models, i.e., they employ machine‐learning techniques to identify misinformation, but model‐driven approaches have recently begun to emerge as well, along with graph‐based approaches that focus on credibility propagation. Because social applications have been developed for different aims and in different contexts, several distinct solutions to credibility assessment in Social Media have been considered. This article considers three of the main tasks that address this issue: (1) the detection of opinion spam in review sites, (2) the detection of fake news and spam in microblogging, and (3) the credibility assessment of online health information. Despite the high number of interesting solutions proposed in the literature to tackle these three tasks, some issues remain unsolved. They mainly concern the absence of predefined benchmarks and gold standard datasets, and the difficulty of collecting and mining large amounts of data, a problem that has not yet received the attention it deserves.
WIREs Data Mining Knowl Discov 2017, 7:e1209. doi: 10.1002/widm.1209
This article is categorized under:
Algorithmic Development > Web Mining
Application Areas > Science and Technology
Technologies > Machine Learning