2011
DOI: 10.1007/978-3-642-19437-5_23

Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features

Abstract: Wikipedia is an online encyclopedia which anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatiotemporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The performanc…

Citations: cited by 112 publications (105 citation statements)
References: 11 publications
“…WikiTrust was also focused to identify vandalized encyclopedic articles. Here the acts of vandalism are identified by the activity of anonyms (users acting without registration) and new users, who are just registered [16,17].…”
Section: Related Work (mentioning)
confidence: 99%
“…Algorithms of the kind calculate the probabilities for each edit that it is actually vandalistic. Four varieties have been developed so far (Adler et al 2011). As far as content is concerned, they may focus on language features (e.g., bad words, pronoun frequencies), or on language-independent textual features (e.g., use of capitals, changes to numerical content, deletion of text).…”
Section: Wikipedia: Algorithms (mentioning)
confidence: 99%
“…Research in this area has crystallized into four categories of algorithm (Adler et al 2011). Algorithms may focus on language features (like bad words, pronoun frequencies), on language-independent textual features (like the use of capitals, changes to numerical content, deletion of text), on metadata of edits (like time and place of the edit, anonymity, absence of revision comment), or on the editor's reputation as a trustworthy contributor.…”
Section: Anti-vandalism Tools In Wikipedia (mentioning)
confidence: 99%
“…Computer scientists have been struggling with the question which approach to the detection of vandalism is the most fruitful. Based on a computer tournament with all approaches in the competition the provisional answer seems to be: a combination of all four algorithms works best (Adler et al 2011).…”
Section: Anti-vandalism Tools In Wikipedia (mentioning)
confidence: 99%
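
The four feature categories named in the citation statements above (language features, language-independent textual features, edit metadata, and editor reputation) can be made concrete with a small sketch. This is only an illustration, not the method of Adler et al. or of STiki or WikiTrust: the function name `edit_features`, the tiny `BAD_WORDS` list, the parameter names, and the assumption that reputation is a score in [0, 1] are all hypothetical. It computes one or two example features per category for a single edit. How a real detector combines such features into a vandalism probability is not shown here.

```python
import re

# Hypothetical word list for illustration; real systems use far larger lexicons.
BAD_WORDS = {"stupid", "idiot", "sucks"}


def edit_features(old_text, new_text, is_anonymous, comment, reputation):
    """Return a few illustrative features per category for a single edit."""
    old_set = set(old_text.split())
    # Words present in the new revision but not in the old one.
    added = [w for w in new_text.split() if w not in old_set]
    letters = [c for c in "".join(added) if c.isalpha()]
    return {
        # Language feature: added words that appear in the bad-word list.
        "bad_words": sum(w.lower().strip(".,!?") in BAD_WORDS for w in added),
        # Language-independent textual features: capitals, size, numbers.
        "upper_ratio": (sum(c.isupper() for c in letters) / len(letters)
                        if letters else 0.0),
        "size_delta": len(new_text) - len(old_text),
        "digits_changed": (re.findall(r"\d+", old_text)
                           != re.findall(r"\d+", new_text)),
        # Metadata features: anonymity and absence of a revision comment.
        "anonymous": is_anonymous,
        "no_comment": not comment.strip(),
        # Reputation feature: assumed to be a score in [0, 1].
        "reputation": reputation,
    }


if __name__ == "__main__":
    features = edit_features(
        "Paris is the capital of France.",
        "Paris is the capital of France. THIS SUCKS!",
        is_anonymous=True,
        comment="",
        reputation=0.1,
    )
    for name, value in features.items():
        print(f"{name}: {value}")
```

A feature dictionary of this kind would typically be passed to a trained classifier, which matches the quoted observation that a combination of all four categories worked best.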