2021
DOI: 10.1177/20539517211029322

Reading datasets: Strategies for interpreting the politics of data signification

Abstract: All datasets emerge from and are enmeshed in power-laden semiotic systems. While emerging data ethics curriculum is supporting data science students in identifying data biases and their consequences, critical attention to the cultural histories and vested interests animating data semantics is needed to elucidate the assumptions and political commitments on which data rest, along with the externalities they produce. In this article, I introduce three modes of reading that can be engaged when studying datasets—a…

Cited by 17 publications (11 citation statements)
References 38 publications
“…However, just as the criminality-prediction example demonstrated the utility of GTT to shed light on the broader contexts upon which an ML system's ground truth is contingent, it becomes important to recognize, here, that this employee-fit system is being developed in the setting of a technology start-up, an industry notoriously associated with both pressures and ambitions of high-growth and scalability. A connotative reading (Poirier, 2021) thus allows one to understand that in this context, it is financially advantageous to develop a product that is generally applicable as opposed to one that is developed for unique situations. Both unsurprisingly and surprisingly, Adam informed me that their system is actually "not for a specific use case.…”
Section: Use-case Specificity: Ground-truthing the Vocality Of Employ...
Citation type: mentioning
confidence: 99%
“…To augment or offer an alternative to other investigative methods (i.e. Poirier, 2021; Van Rossem and Pelizza, 2022), in which emphasis on a direct rigorous engagement with underlying datasets is justifiably correlated with decreased applicability and accessibility, GTT incorporates a retroactively deductive/logical process in which the end task itself presents the greatest clue as to how a ground truth might be established. This is not to say that the most robust and ideal iteration of GTT would not require direct engagement with the underlying datasets and labeling procedures, but rather to emphasize that even when faced with a relative lack of such information, GTT can still inform a systematic process for deducing the qualitative processes through which ground truth datasets are created.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Although the concept of data literacy remains contested, it is useful for grounding and understanding data practices in their communicative settings, in relation to data's representational (communicative) and functional (computational) status (Poirier, 2021). Theoretical and practical work on data literacy generally seeks to improve individual competencies and critical awareness (D’Ignazio and Bhargava, 2015; Frank et al, 2016) as an aspect of contemporary ‘data citizenship’ (Carmi and Yates, 2020) and shifting big data practices.…”
Section: Defining Data Literacy Beyond Individual Competencies
Citation type: mentioning
confidence: 99%
“…These intersecting logics of datasets and archives can trace not only the provenance of datasets, but also their embeddedness within particular contexts (Gebru et al 2021). Such multifaceted methods for 'reading' datasets (Poirier 2021) can help to surface the technical, cultural, and political aspects of data in relation to other cultural records.…”
Section: Introduction
Citation type: mentioning
confidence: 99%