2017
DOI: 10.1098/rsos.171217

Robust clustering of languages across Wikipedia growth

Abstract: Wikipedia is the largest existing knowledge repository that is growing on a genuine crowdsourcing support. While the English Wikipedia is the most extensive and the most researched one with over 5 million articles, comparatively little is known about the behaviour and growth of the remaining 283 smaller Wikipedias, the smallest of which, Afar, has only one article. Here, we use a subset of these data, consisting of 14 962 different articles, each of which exists in 26 different languages, from Arabic to Ukrain…

Citations: cited by 20 publications (13 citation statements)
References: 52 publications
“…Other research groups also start to apply Wikipedia ranking in Wikiometrics [23]. We also note the growing interest to scientific analysis of several language editions of Wikipedia [24].…”
Section: Introduction (mentioning)
confidence: 98%
“…We also intend to explore the use of neural language models to generate context embeddings in order to improve the quality of the context representation. Finally, we intend to integrate our methods with other natural language processing tasks [4,5,7,14,53,49] that might benefit from representing words as context embeddings.…”
Section: Results (mentioning)
confidence: 99%
“…In recent years, information from Wikipedia has been used in various scientific disciplines, such as media studies, the geography of information, computer science, computational linguistics or social physics. It has been used to study social phenomena as diverse as the structure of relations between languages (Eom et al 2015; Ronen et al 2014; Ban, Perc, and Levnajić 2017), geopolitical instability (Apic, Betts, and Russell 2011) or the similarity of interests between countries (Karimi et al 2015). It has also been used to analyze historical processes, such as large demographic trends (Reznik and Shatalov 2016), birth-death migration trajectories (Schich et al 2014), movements of famous people during their lives (Menini et al 2017), or the influence of the media on collective memory (Jara-Figueroa, Yu, and Hidalgo 2019).…”
Section: The Biographical Record On Wikipedia and Its Reflexive Poten… (mentioning)
confidence: 99%