Controversies around race and machine learning have sparked debate among computer scientists over how to design machine learning systems that guarantee fairness. These debates rarely engage with how racial identity is embedded in our social experience, which gives it sociological and psychological complexity. This complexity challenges the paradigm of treating fairness as a formal property of supervised learning with respect to protected personal attributes. Racial identity is not simply a personal subjective quality. For people labeled "Black," it is an ascribed political category that has consequences for social differentiation, embedded in systemic patterns of social inequality achieved through both social and spatial segregation. In the United States, racial classification is best understood as a system of inherently unequal status categories that places whites as the most privileged category while signifying the Negro/black category as stigmatized. Social stigma is reinforced through the unequal distribution of societal rewards and goods along racial lines, a distribution maintained by state, corporate, and civic institutions and practices. This creates a dilemma for society and for designers: be blind to racial group disparities, and thereby reify racialized social inequality by no longer measuring systemic inequality, or be conscious of racial categories in a way that itself reifies race. We propose a third option. By preceding group fairness interventions with unsupervised learning that dynamically detects patterns of segregation, machine learning systems can mitigate the root cause of social disparities, namely social segregation and stratification, without further anchoring status categories of disadvantage.
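The abstract describes this third option only at a high level. As one hypothetical illustration of what preceding a fairness audit with unsupervised segregation detection might look like, the sketch below clusters individuals on segregation-relevant features (here, synthetic spatial coordinates) and measures outcome disparity across the inferred clusters rather than across fixed racial categories. The choice of k-means, the feature set, and the demographic-parity-style metric are all assumptions made for illustration, not the authors' method.

```python
# Hypothetical sketch: infer latent groups from segregation-relevant
# features (e.g., residential location, network ties) via unsupervised
# clustering, then audit a model's outcomes across the inferred groups
# instead of across fixed racial categories.
import numpy as np
from sklearn.cluster import KMeans

def inferred_group_disparity(features, predictions, n_groups=2, seed=0):
    """Cluster individuals on segregation-relevant features and return
    the gap in positive-prediction rates across inferred clusters
    (a demographic-parity-style measure over dynamically detected groups)."""
    clusters = KMeans(n_clusters=n_groups, random_state=seed,
                      n_init=10).fit_predict(features)
    rates = [predictions[clusters == g].mean() for g in range(n_groups)]
    return max(rates) - min(rates)

# Toy usage: two spatially separated populations with unequal outcomes.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0, 1, (500, 2)),   # population A
                      rng.normal(5, 1, (500, 2))])  # population B
predictions = np.concatenate([rng.binomial(1, 0.7, 500),
                              rng.binomial(1, 0.4, 500)])
print(f"disparity across inferred groups: "
      f"{inferred_group_disparity(features, predictions):.2f}")
```

Because the groups are re-inferred from current data on each run, the measured disparity tracks present patterns of segregation rather than anchoring a fixed status category.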
In this work, we propose the first framework for integrating Differential Privacy (DP) and Contextual Integrity (CI). DP is a property of an algorithm that injects statistical noise to obscure information about individuals represented within a database. CI defines privacy as information flow that is appropriate to its social context. Taken together, these paradigms outline two dimensions along which to analyze the privacy of information flows: descriptive and normative properties. We show that our new integrated framework provides benefits to both CI and DP that cannot be attained when each definition is considered in isolation: it enables contextually guided tuning of the ε parameter in DP, and it enables CI to be applied to a broader set of information flows occurring in real-world systems, such as those involving privacy-enhancing technologies (PETs) and machine learning. We conclude with a case study based on the use of DP in the U.S. Census Bureau.
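For concreteness, the sketch below shows the Laplace mechanism, the canonical way a DP algorithm injects statistical noise: a count query is perturbed with noise of scale 1/ε, so a smaller ε yields more noise and stronger privacy. This is the standard textbook mechanism rather than code from the paper; the query and the candidate ε values are illustrative, and in the authors' framework the choice among such ε values would be guided by the norms of the surrounding information flow.

```python
# Minimal sketch of the Laplace mechanism, assuming a simple count
# query. A count has sensitivity 1 (adding or removing one individual
# changes the true answer by at most 1), so Laplace noise with scale
# 1/epsilon suffices for epsilon-differential privacy.
import numpy as np

def laplace_count(data, predicate, epsilon, rng=np.random.default_rng()):
    """Release a differentially private count of items satisfying predicate."""
    true_count = sum(predicate(x) for x in data)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

ages = [34, 29, 41, 52, 38, 27, 45]
# Contextual tuning would pick one epsilon per flow; we compare a few.
for eps in (0.1, 1.0, 10.0):
    print(eps, round(laplace_count(ages, lambda a: a >= 40, eps), 1))
```

Running the loop makes the ε trade-off tangible: at ε = 0.1 the released count is dominated by noise, while at ε = 10 it is close to the true value of 3, which is exactly the dial that contextual norms would help set.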