Proceedings of the Fourth International Conference on Communities and Technologies 2009
DOI: 10.1145/1556460.1556463

Measuring self-focus bias in community-maintained knowledge repositories

Abstract: Self-focus is a novel way of understanding a type of bias in community-maintained Web 2.0 graph structures. It goes beyond previous measures of topical coverage bias by encapsulating both node- and edge-hosted biases in a single holistic measure of an entire community-maintained graph. We outline two methods to quantify self-focus, one of which is very computationally inexpensive, and present empirical evidence for the existence of self-focus using a "hyperlingual" approach that examines 15 different language e…

Citations: cited by 87 publications (97 citation statements)
References: 12 publications
“…In particular, as our choice of sampling strategy, we focused on those that represent urban elements commonly interpreted as leisure POIs, such as cafes, restaurants, pubs and bars. These are the categories that are most common to mobile applications such as MyCityWay, 6 Google HotPot 7 and Foursquare, 8 which are used by city dwellers to navigate the urban landscape. To ensure we are considering genuine crowd-sourcing contributions, and not those made by bots (i.e., mass imports), we have eliminated from the dataset those users who performed an excessive number of edits in a very short time (i.e., those who edited more than 40 POIs in a single changeset session in OSM, with the threshold of 40 chosen after manual inspection of the per-user edit distribution).…”
Section: Research Methodology Dataset Description (mentioning)
confidence: 99%
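
The bot filter described in the statement above can be sketched as follows. This is a minimal illustration, not the citing authors' code: the `(user, changeset_id, poi_id)` record layout and the function name `filter_bulk_editors` are assumptions made for the example; only the threshold of 40 POIs per changeset comes from the quoted text.

```python
from collections import defaultdict

# Threshold taken from the quoted statement: more than 40 POIs in one changeset.
MAX_POIS_PER_CHANGESET = 40


def filter_bulk_editors(edits, threshold=MAX_POIS_PER_CHANGESET):
    """Drop every edit made by a user who has at least one oversized changeset.

    `edits` is an iterable of (user, changeset_id, poi_id) tuples. This layout
    is hypothetical and is not the OpenStreetMap data model.
    """
    edits = list(edits)

    # Collect the distinct POIs touched in each (user, changeset) pair.
    pois_by_changeset = defaultdict(set)
    for user, changeset_id, poi_id in edits:
        pois_by_changeset[(user, changeset_id)].add(poi_id)

    # A user is flagged if any single changeset exceeds the threshold.
    flagged = {
        user
        for (user, _changeset_id), poi_ids in pois_by_changeset.items()
        if len(poi_ids) > threshold
    }

    # Keep only edits from users who were never flagged.
    return [edit for edit in edits if edit[0] not in flagged]
```

The sketch removes all of a flagged user's edits, not only the oversized changeset, which matches the statement's wording that such users were eliminated from the dataset.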
“…For example, [10] studied gender imbalance in Wikipedia, and reported on how topics of particular interest to females were substantially less covered than topics of specific interest to males. In [8], the indegree summation (i.e., number of inlinks per article in the Wikipedia Article Graph) on 15 different language editions of Wikipedia was analysed; their findings suggest that population is not the most important factor to be considered, and other factors such as fluency in languages are more strongly correlated with indegree instead. They conclude that, when developing technologies to rely upon community maintained repositories, contextual factors of the contributors, such as language and culture, must be carefully examined.…”
Section: Background and Related Work (mentioning)
confidence: 99%
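
To make the "indegree summation" mentioned in the statement above concrete, here is a minimal sketch of counting inlinks per article and summing them over a group of articles. It is an illustration under assumptions, not the cited paper's method: the `(source, target)` link-pair input and the names `indegrees` and `indegree_sum` are hypothetical.

```python
from collections import Counter


def indegrees(links):
    """Return the number of inlinks per article.

    `links` is an iterable of (source, target) article-title pairs
    (an assumed input format). Repeated pairs are counted once.
    """
    unique_links = set(links)
    return Counter(target for _source, target in unique_links)


def indegree_sum(links, articles):
    """Sum the indegrees of the given articles; unlinked articles add 0."""
    counts = indegrees(links)
    return sum(counts[article] for article in articles)
```

For example, with links A→B, C→B and A→C, article B has indegree 2 and article C has indegree 1, so the sum over {B, C} is 3.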
“…Moreover it has been found that each language Wikipedia exhibits a self-focus bias towards articles about regions where that language is largely spoken [12].…”
Section: Language Wikipedia Communities and Their Points Of View (mentioning)
confidence: 99%
“…Previous research on inter-language analysis of Wikipedia articles mainly studied the geographic focus [2,3], famous and prominent people [4][5][6], historical figures [7], editing behavior [1], user interaction [8], and article structure [9] of Wikipedia editions in different languages. Additionally, concept overlap [10], inter-language links [11] between Wikipedia versions in multiple languages have been examined.…”
Section: Introduction (mentioning)
confidence: 99%