As AI-based systems increasingly impact many areas of our lives, auditing these systems for fairness is an increasingly high-stakes problem. Traditional group fairness metrics can miss discrimination against individuals and are difficult to apply after deployment.Counterfactual fairness describes an individualized notion of fairness, but is even more challenging to evaluate after deployment. We present prediction sensitivity, an approach for continual audit of counterfactual fairness in deployed classifiers. Prediction sensitivity helps answer the question: would this prediction have been different, if this individual had belonged to a different demographic group-for every prediction made by the deployed model. Prediction sensitivity can leverage correlations between protected status and other features, and does not require protected status information at prediction time. Our empirical results demonstrate that prediction sensitivity is effective for detecting violations of counterfactual fairness.
Deep learning has produced big advances in artificial intelligence, but trained neural networks often reflect and amplify bias in their training data, and thus produce unfair predictions. We propose a novel measure of individual fairness, called prediction sensitivity, that approximates the extent to which a particular prediction is dependent on a protected attribute. We show how to compute prediction sensitivity using standard automatic differentiation capabilities present in modern deep learning frameworks, and present preliminary empirical results suggesting that prediction sensitivity may be effective for measuring bias in individual predictions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.