Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency 2021
DOI: 10.1145/3442188.3445916

Censorship of Online Encyclopedias

Abstract: While artificial intelligence provides the backbone for many tools people use around the world, recent work has brought to attention that the algorithms powering AI are not free of politics, stereotypes, and bias. While most work in this area has focused on the ways in which AI can exacerbate existing inequalities and discrimination, very little work has studied how governments actively shape training data. We describe how censorship has affected the development of Wikipedia corpuses, text data which are regul…

Citations: cited by 5 publications (4 citation statements)
References: 37 publications
“…Increased surveillance or censorship may amplify existing feedback loops such as "chilling effects", whereby the anticipation of surveillance leads individuals to self-censor (Kwon et al, 2015). In a distinct feedback loop, censorship of web text, for example of online encyclopedias, can then affect the quality of a LM trained on such data (Yang and Roberts, 2021).…”
Section: Problem (mentioning)
confidence: 99%
“…In attempting to update Wikipedia pages from China, it was discovered that the VPN IP address was blocked by Wikipedia for editing, probably due to previous bad editing by others on the same IP address (see Figure 3). Thus, a combination of the Great Firewall of China due to Chinese censorship (Harrison 2019; Yang & Roberts 2021) and IP address blocking makes both reading and writing of pages problematic. Reading pages can be overcome using VPN access but writing or updating pages proved more difficult.…”
Section: Wikipedia Pages (mentioning)
confidence: 99%
“…Chouldechova et al audit an algorithm-assisted child maltreatment hotline screening system and identify many of the challenges in implementing such an investigation in practice [16]. Yang et al demonstrate how political censorship of Wikipedia can affect pre-trained models used for general domain NLP algorithms [73]. Bender and Gebru et al critically examine the environmental and financial costs first of large language models and offer some recommendations for curating and documenting datasets more carefully [7].…”
Section: Background and Related Work 21 Accountable Algorithms (mentioning)
confidence: 99%