What would data science look like if its key critics were engaged to help improve it, and how might critiques of data science improve with an approach that considers the day-to-day practices of data science? This article argues for scholars to bridge the conversations that seek to critique data science and those that seek to advance data science practice to identify and create the social and organizational arrangements necessary for a more ethical data science. We summarize four critiques that are commonly made in critical data studies: data are inherently interpretive, data are inextricable from context, data are mediated through the sociomaterial arrangements that produce them, and data serve as a medium for the negotiation and communication of values. We present qualitative research with academic data scientists, “data for good” projects, and specialized cross-disciplinary engineering teams to show evidence of these critiques in the day-to-day experience of data scientists as they acknowledge and grapple with the complexities of their work. Using ethnographic vignettes from two large multiresearcher field sites, we develop a set of concepts for analyzing and advancing the practice of data science and improving critical data studies, including (1) communication is central to the data science endeavor; (2) making sense of data is a collective process; (3) data are starting, not end points, and (4) data are sets of stories. We conclude with two calls to action for researchers and practitioners in data science and critical data studies alike. First, creating opportunities for bringing social scientific and humanistic expertise into data science practice simultaneously will advance both data science and critical data studies. Second, practitioners should leverage the insights from critical data studies to build new kinds of organizational arrangements, which we argue will help advance a more ethical data science. Engaging the insights of critical data studies will improve data science. Careful attention to the practices of data science will improve scholarly critiques. Genuine collaborative conversations between these different communities will help push for more ethical, and better, ways of knowing in increasingly datum-saturated societies.
This essay draws on qualitative social science to propose a critical intellectual infrastructure for data science of social phenomena. Qualitative sensibilitiesinterpretivism, abductive reasoning, and reflexivity in particular-could address methodological problems that have emerged in data science and help extend the frontiers of social knowledge. First, an interpretivist lens-which is concerned with the
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.