2018
DOI: 10.1177/0162243918781268

Data Cleaners for Pristine Datasets: Visibility and Invisibility of Data Processors in Social Science

Abstract: This article investigates the work of processors who curate and “clean” the data sets that researchers submit to data archives for archiving and further dissemination. Based on ethnographic fieldwork conducted at the data processing unit of a major US social science data archive, I investigate how these data processors work, under which status, and how they contribute to data sharing. This article presents two main results. First, it contributes to the study of invisible technicians in science by showing that …

Cited by 53 publications (50 citation statements)
References 30 publications
“…In the moment, ephemeral information seeking leaves much of the labor less visible, if not invisible, to a variety of colleagues who take practices of the research infrastructure for granted. This iterative, ad hoc labor to identify and work with changes is another aspect to cleaning and processing data [14,16]. The resulting information produced about changes is really additional metadata about the scientific process itself that must be aligned to different contexts in spite of friction [6].…”
Section: Discussion (mentioning)
confidence: 99%
“…In our study we constructed our field to start surfacing the invisible work behind data change in some infrastructures of scientific research by drawing upon data processing or cleaning studies. Previous work stresses the labor intensive work of data processing or cleaning [17,14,18,16]. Rawson and Muñoz [18] note that specifics of data cleaning often "reside in the general professional practices, materials, personal histories, and tools of the researchers" rather than explicitly captured and included with a data release.…”
Section: Literature Review (mentioning)
confidence: 99%
“…archives, federated data networks, virtual observatories), and mandatory data disclosure in response to policies by journals (Rousi and Laakso 2020) and funders (Andreoli-Versbach and Mueller-Langer 2014). Costs for preparing research data to be reused are high, limiting sharing behaviour even among advocates (Fecher et al 2017;Plantin 2019). These costs, including time to format, annotate, and curate the data, as well as concerns over privacy, 'scooping', and misuse, must be balanced against the promised efficiencies of data reuse (Pronk 2019).…”
Section: Data and Materials Sharing (mentioning)
confidence: 99%
“…Data processing work is the often laborious task to transform resources into an analyzable state that as a process can be rife with easily lost changes that shape knowledge being constructed (Paine and Ramakrishnan 2019;Paine et al, 2015;Plantin 2019). Efforts to process and clean data are often work that melds into the background, invisible to outside observers looking at the shiny elements in ecologies of work who are not always paying attention to all of the indicators (Star and Strauss 1999).…”
Section: Making Invisible Data Processing Work Visible (mentioning)
confidence: 99%