Proceedings of the 20th International Conference Companion on World Wide Web 2011
DOI: 10.1145/1963192.1963349

Wikipedia vandalism detection

Abstract: Wikipedia is an online encyclopedia that anyone can edit. The fact that there are almost no restrictions on contributing content is at the core of its success. However, it also attracts pranksters, lobbyists, spammers, and other people who degrade Wikipedia's content. One of the most frequent kinds of damage is vandalism, which is defined as any bad-faith attempt to damage Wikipedia's integrity. For some years, the Wikipedia community has been fighting vandalism using automatic detection systems. In…

Cited by 26 publications (9 citation statements)
References 16 publications
“…[17][18][19][20]). As mentioned by van den Berg et al [14], it could be useful to apply similar approaches, methodologies and technologies, which have already been utilized in other open source projects and Web 2.0 encyclopedias, to detect vandalism in OSM and/or revert unconstructive changes.…”
Section: Introduction (mentioning)
confidence: 99%
“…• RQ4: The most used method has been the use of classifiers, in machine learning processes, for the detection of acts of vandalism, against a previously established corpus [17]. The analysis that the researchers carry out is the same as outlined by Adler et al [18], and relates to one of the four basic computational approaches: language characteristics, textual content characteristics, metadata relating to publications and the reputation of editors.…”
Section: Discussion (mentioning)
confidence: 99%
“…The best approaches of that competition were based on timing analysis of revisions [20], language features [16], and user reputation [1]; the three approaches were then unified in [3].…”
Section: Related Work (mentioning)
confidence: 99%
“…An estimate of contribution quality can be used to flag some contributions for review, as well as for producing initial rankings of new content. For the Wikipedia, there has been a large body of work on automated methods for detecting vandalism and flagging revisions for review [17,18,20,5,16,14]. These methods generally rely on a mix of machine learning and natural language processing; a yearly competition (PAN) compares the performance of such detection methods.…”
Section: Introduction (mentioning)
confidence: 99%