Race and gender have long sociopolitical histories of classification in technical infrastructures, from the passport to social media. Facial analysis technologies are particularly pertinent to understanding how identity is operationalized in new technical systems. What facial analysis technologies can do is determined by the data available to train and evaluate them. In this study, we focus specifically on this data by examining how race and gender are defined and annotated in image databases used for facial analysis. We found that the majority of image databases rarely contain underlying source material for how those identities are defined. Further, when databases are annotated with race and gender information, database authors rarely describe the process of annotation. Instead, classifications of race and gender are portrayed as insignificant, indisputable, and apolitical. We discuss the limitations of these approaches given the sociohistorical nature of race and gender. We posit that the lack of critical engagement with this nature renders databases opaque and less trustworthy. We conclude by encouraging database authors to address both the histories of classification inherently embedded into race and gender and their own positionality in embedding such classifications.
Transgender individuals in the United States face significant threats to interpersonal safety; however, there has been relatively little research in the HCI and CSCW communities documenting transgender individuals' experiences of technology-mediated safety and harm. In this study, we interviewed 12 transgender and non-binary individuals to understand how they find, create, and navigate safe spaces using technology. Managing safety was a universal concern for our transgender participants, and they experienced complex manifestations of harm through technology. We found that harmful experiences for trans users could arise as targeted or incidental affronts, could be sourced from outsiders or insiders, and could be directed against individuals or entire communities. Notably, some violations implicated technology design, while others tapped broader social dynamics. Reading our findings through the notions of "space" and "place," we unpack challenges and opportunities for building safer futures with trans folks, other vulnerable users, and their allies.
Investigations of facial analysis (FA) technologies, such as facial detection and facial recognition, have been central to discussions about Artificial Intelligence's (AI) impact on human beings. Research on automatic gender recognition, the classification of gender by FA technologies, has raised concerns around racial and gender bias. In this study, we augment past work with empirical data by conducting a systematic analysis of how gender classification and gender labeling in computer vision services operate when faced with gender diversity. We sought to understand how gender is concretely conceptualized and encoded into commercial facial analysis and image labeling technologies available today. We then conducted a two-phase study: (1) a system analysis of ten commercial FA and image labeling services and (2) an evaluation of five services using a custom dataset of diverse genders built from self-labeled Instagram images. Our analysis highlights how gender is codified into both classifiers and data standards. We found that FA services performed consistently worse on transgender individuals and were universally unable to classify non-binary genders. In contrast, image labeling often presented multiple gendered concepts. We also found that user perceptions about gender performance and identity contradict the way gender performance is encoded into the computer vision infrastructure. We discuss our findings from three perspectives of gender identity (self-identity, gender performativity, and demographic identity) and how these perspectives interact across three layers: the classification infrastructure, the third-party applications that make use of that infrastructure, and the individuals who interact with that software. We employ Bowker and Star's concepts of "torque" and "residuality" to further discuss the social implications of gender classification. We conclude by outlining opportunities for creating more inclusive classification infrastructures and datasets, as well as implications for policy.
CCS Concepts: • Social and professional topics → User characteristics; Gender.
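To make concrete what it means for gender to be "codified into both classifiers and data standards," the following is a minimal sketch of querying one commercial facial analysis API and inspecting how gender appears in its response schema. AWS Rekognition (via boto3) is used here purely as an illustration of the kind of service the study describes; the abstract does not name the specific services evaluated, so treat the service choice as an assumption rather than part of the study's method.

```python
# Illustration only: query a commercial FA API and inspect how gender is
# represented in its response. AWS Rekognition is an assumed example of the
# kind of service the study describes, not necessarily one it evaluated.
import boto3

client = boto3.client("rekognition")

with open("portrait.jpg", "rb") as f:
    image_bytes = f.read()

response = client.detect_faces(
    Image={"Bytes": image_bytes},
    Attributes=["ALL"],  # request the full attribute set, including Gender
)

for face in response["FaceDetails"]:
    gender = face["Gender"]
    # The schema admits only a binary value ("Male" / "Female") plus a
    # confidence score; there is no way for the service to return a
    # non-binary or unspecified gender. This is the sort of encoded
    # assumption in the classification infrastructure the study examines.
    print(gender["Value"], gender["Confidence"])
```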
Data is a crucial component of machine learning. The field relies on data to train, validate, and test models. With increased technical capabilities, machine learning research has boomed in both academic and industry settings, and one major focus has been on computer vision. Computer vision is a popular domain of machine learning that is increasingly pertinent to real-world applications, from facial recognition in policing to object detection for autonomous vehicles. Given computer vision's propensity to shape machine learning research and to impact human life, we seek to understand disciplinary practices around dataset documentation: how data is collected, curated, annotated, and packaged into datasets for computer vision researchers and practitioners to use for model tuning and development. Specifically, we examine what dataset documentation communicates about the underlying values of vision data and the larger practices and goals of computer vision as a field. To conduct this study, we collected a corpus of about 500 computer vision datasets, from which we sampled 114 dataset publications across different vision tasks. Through both structured and thematic content analysis, we document a number of values around accepted data practices, what makes desirable data, and the treatment of humans in the dataset construction process. We discuss how computer vision dataset authors value efficiency at the expense of care; universality at the expense of contextuality; impartiality at the expense of positionality; and model work at the expense of data work. Many of the silenced values we identify sit in opposition to social computing practices. We conclude with suggestions on how to better incorporate silenced values into the dataset creation and curation process.
CCS Concepts: • Human-centered computing → Collaborative and social computing; • Computing methodologies → Artificial intelligence.