In response to growing concerns of bias, discrimination, and unfairness perpetuated by algorithmic systems, the datasets used to train and evaluate machine learning models have come under increased scrutiny. Many of these examinations have focused on the contents of machine learning datasets, finding glaring underrepresentation of minoritized groups. In contrast, relatively little work has been done to examine the norms, values, and assumptions embedded in these datasets. In this work, we conceptualize machine learning datasets as a type of informational infrastructure, and motivate a genealogy as method in examining the histories and modes of constitution at play in their creation. We present a critical history of ImageNet as an exemplar, utilizing critical discourse analysis of major texts around ImageNet’s creation and impact. We find that assumptions around ImageNet and other large computer vision datasets more generally rely on three themes: the aggregation and accumulation of more data, the computational construction of meaning, and making certain types of data labor invisible. By tracing the discourses that surround this influential benchmark, we contribute to the ongoing development of the standards and norms around data development in machine learning and artificial intelligence research.
This research shows how face masks took on discursive political significance during the early stages of the coronavirus disease 2019 pandemic in the United States. The authors argue that political divisions over masks cannot be understood by looking to partisan differences in mask-wearing behaviors alone. Instead, they show how the mask became a political symbol enrolled into patterns of affective polarization. This study relies on qualitative and computational analyses of opinion articles (n = 7,970) and supplemental analyses of Twitter data, the transcripts of major news networks, and longitudinal survey data. First, the authors show that antimask discourse was consistently marginal and that backlash against mask refusal came to prominence and did not decline even as masking behaviors normalized and partly depolarized. Second, they show that backlash against mask refusal, rather than mask refusal itself, was the primary way masks were discussed in relation to national electoral, governmental, and partisan themes.