Diversity and neocolonialism in Big Data research: Avoiding extractivism while struggling with paternalism

Helm, Paula; de Götzen, Amalia; Cernuzzi, Luca; Hume, Alethia; Diwakar, Shyam; Ruiz Correa, Salvador; Gatica-Perez, Daniel

doi:10.1177/20539517231206802

Cited by 6 publications

(4 citation statements)

References 53 publications


“…Despite this potential, critical commentators on AI language technology point out how well-intended research goals such as "technology-based revitalization" regularly misinterpret the needs of local communities (Bender et al, 2021; Bird, 2022). In most cases, native speakers are not involved in the process, or if they are, they are taking on subordinate roles such as commentator, validator, tester, or worse, data extractor (Helm et al, 2023). Instead of cocreating on an equal footing, in many cases the analytical, high-level work is done in technology labs of Western universities or companies, where the languages being studied are often not even understood by the people working on them, let alone the cultures they represent (Arora, 2016).…”

Section: Ethical Concerns With Biases In Language Technology (mentioning)

confidence: 99%

“…However, many such efforts are based on a vision according to which, with the help of AI, already successfully developed and applied methods and systems that are designed and sought of from an anglo-centric culture of technology development, are one-to-one adopted to other contexts (Bird, 2020; Schwartz, 2022). This approach to bridging the divide leads to a misalignment between the interests and solutions of the former and the living realities of the latter (Helm et al, 2023). Worse, due to general ignorance of the more profound dimensions of linguistic diversity and ultimately the cultural differences that meaningful diversity embodies, major quality problems in the results are neglected, which, as we will show, can result in far-reaching forms of westernized cultural homogenization and epistemic injustice (Spivak, 1988).…”

Section: Introduction (mentioning)

confidence: 99%


Diversity and language technology: how language modeling bias causes epistemic injustice

Helm, Bella, Koch et al. 2024. Ethics Inf Technol

Self Cite


It is well known that AI-based language technology—large language models, machine translation systems, multilingual dictionaries, and corpora—is currently limited to three percent of the world’s most widely spoken, financially and politically backed languages. In response, recent efforts have sought to address the “digital language divide” by extending the reach of large language models to “underserved languages.” We show how some of these efforts tend to produce flawed solutions that adhere to a hard-wired representational preference for certain languages, which we call language modeling bias. Language modeling bias is a specific and under-studied form of linguistic bias where language technology by design favors certain languages, dialects, or sociolects with respect to others. We show that language modeling bias can result in systems that, while being precise regarding languages and cultures of dominant powers, are limited in the expression of socio-culturally relevant notions of other communities. We further argue that at the root of this problem lies a systematic tendency of technology developer communities to apply a simplistic understanding of diversity which does not do justice to the more profound differences that languages, and ultimately the communities that speak them, embody. Drawing on the concept of epistemic injustice, we point to the broader ethico-political implications and show how it can lead not only to a disregard for valuable aspects of diversity but also to an under-representation of the needs of marginalized language communities. Finally, we present an alternative socio-technical approach that is designed to tackle some of the analyzed problems.


“…There are critical questions geographers and other social scientists are already asking to understand where, when and how AI for Good, and ‘climate AI’ in particular, make sense (Alvarez Leon, 2024; Nost & Colven, 2022). Researchers have shown how AI for Good might displace existing regimes of knowledge and expertise in global development work (McDuie‐Ra & Gulson, 2019), greenwash irresponsible corporate behaviour (Espinoza & Aronczyk, 2021) and how Data for Good can perpetuate paternalistic models of development (Helm et al., 2023). Arguably, these are a result of how AI for Good discourse emphasizes its just ends, rather than its means (as might be true of AI projects in general; see Mattern, 2020).…”

Section: Intersections (mentioning)

confidence: 99%

Governing AI, governing climate change?

Nost 2024. Geography and Environment


Those concerned with climate governance will want to keep watching what is happening in AI governance. Far from unrelated, the two parallel one another in terms of how fractions of capital—whether within fossil fuel or tech sectors—call for legislating in the face of crisis or for voluntary pledges. In truth, both may be said to be forms of self‐governance. Climate and AI intersect firstly in how they are imagined: dominant climate and AI discourses are both symptoms of Anthropocene thinking and ‘capitalist realism’. They also intersect in as much as ‘AI for Good’ initiatives propose that AI is ethical because it can help to address climate change. What seems missing, however, is any consideration of this climate AI as a procedure—is its knowledge valid, what knowledges does it displace or exclude, what biases are reproduced?—and consideration for its consequences, including harms. Does it actually result in climate mitigation and/or adaptation in a given context? What ‘maladaptive’ outcomes might it drive? What alternatives does it foreclose? These sorts of questions are ones where geographers will continue to have a lot to say.


“…Extractivism understood more broadly is a topic that has drawn increasing attention in the literature on data-sharing. For example, both Rodima-Taylor (2024) and Helm et al (2023) identify extractivist practices and logics in the governance and practices of "big data" technologies and knowledge production. Relatedly, an increasing number of case-studies identify epistemically extractivist data-generating and data-sharing practices across different academic disciplines.…”

Section: Introduction (mentioning)

confidence: 99%

On Epistemic Extractivism and the Ethics of Data-Sharing

Landström 2024. Philosophy of the Social Sciences


In this article I argue that data-sharing risks becoming epistemically extractivist and is a practice sensitive to Linda Martín Alcoff’s challenges for extractivist epistemologies. I situate data-sharing as a socio-epistemic practice that gives rise to ethical and epistemic challenges. I draw on the findings of an institutional ethnography of an international social science research project to identify several ethical and epistemic concerns, including epistemic extractivism. I identify Alcoff’s first and second challenge for extractivist epistemologies in the findings of the empirical investigation and argue that they are important considerations for the ethics and socio-epistemological functioning of data-sharing in social science.
