Algorithmic bias arises in machine learning when a model with reasonable overall accuracy nevertheless favors ‘good’ outcomes for one side of a sensitive category such as gender or race. This bias manifests as an underestimation of good outcomes for the under-represented minority. In a sense, we should not be surprised that a model is biased when it has not been ‘asked’ not to be: reasonable accuracy can be achieved while ignoring the under-represented minority. A common strategy for addressing this issue is to include fairness as a component of the learning objective. In this paper, we include fairness as an additional criterion in model training and propose a multi-objective optimization strategy, based on Pareto Simulated Annealing, that optimizes for both accuracy and underestimation bias. Our experiments show that this strategy can identify families of models whose members represent different accuracy/fairness tradeoffs. We demonstrate its effectiveness on two synthetic and two real-world datasets.
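The abstract leaves the optimization procedure at a high level. The general shape of a Pareto Simulated Annealing loop, which anneals a single candidate while archiving every non-dominated solution it encounters, might look like the sketch below; the `perturb` and `objectives` arguments and the toy usage are illustrative placeholders, not the authors' implementation.

```python
import math
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_simulated_annealing(init, perturb, objectives,
                               n_iters=5000, t0=1.0, cooling=0.999):
    """Anneal one solution while keeping an archive of non-dominated ones.

    init       -- starting solution (e.g. a model configuration)
    perturb    -- function returning a random neighbour of a solution
    objectives -- function mapping a solution to a tuple of losses,
                  e.g. (1 - accuracy, underestimation_bias)
    """
    current, current_obj = init, objectives(init)
    archive = [(current, current_obj)]
    t = t0
    for _ in range(n_iters):
        cand = perturb(current)
        cand_obj = objectives(cand)
        if dominates(cand_obj, current_obj):
            accept = True
        else:
            # Fall back to a temperature-dependent acceptance test on the
            # aggregate loss, so worse moves are still possible early on.
            delta = sum(cand_obj) - sum(current_obj)
            accept = delta <= 0 or random.random() < math.exp(-delta / t)
        if accept:
            current, current_obj = cand, cand_obj
            # Keep the archive non-dominated: drop anything the candidate
            # dominates, and add it only if nothing in the archive dominates it.
            if not any(dominates(o, cand_obj) for _, o in archive):
                archive = [(s, o) for s, o in archive
                           if not dominates(cand_obj, o)]
                archive.append((cand, cand_obj))
        t *= cooling
    return archive  # a family of trade-off solutions, not a single optimum

# Toy usage: two competing quadratic losses whose Pareto front lies on [0, 1].
front = pareto_simulated_annealing(
    init=5.0,
    perturb=lambda w: w + random.gauss(0, 0.1),
    objectives=lambda w: (w ** 2, (w - 1) ** 2),
)
print(sorted(round(w, 2) for w, _ in front))
```

Returning the whole archive rather than a single optimum is what yields the family of models with different accuracy/fairness tradeoffs that the abstract describes.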
Estimating conditional probabilities from data samples is a building block of many machine learning algorithms. However, these estimates often do not reflect the true underlying distribution. As ML systems become more ubiquitous in high-risk decision making, it is critical to investigate how a model's predictions can misrepresent the actual data distribution. To this end, we explore the notion of underestimation bias, which captures the extent to which a learned model's predictions deviate from the true distribution. Since correcting underestimation bias may come at the cost of accuracy, we propose a multi-objective optimization strategy that yields a diverse set of models whose members represent different accuracy/faithfulness tradeoffs. We empirically evaluate our framework on two synthetic and twelve real-world datasets and show that it can address underestimation bias while maintaining adequate overall generalization accuracy.
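Concretely, one simple way to quantify underestimation for a subgroup is the ratio of the model's predicted positive rate to the actual positive rate within that group. The `underestimation_score` function below is a hypothetical stand-in for the paper's metric, which the abstract does not define.

```python
import numpy as np

def underestimation_score(y_true, y_pred, group_mask):
    """Ratio of predicted to actual positive rates within a subgroup.

    A value below 1 indicates the model predicts the desirable outcome
    for the group less often than it actually occurs in the data.
    (Illustrative definition; assumes the group's actual rate is non-zero.)
    """
    actual_rate = y_true[group_mask].mean()
    predicted_rate = y_pred[group_mask].mean()
    return predicted_rate / actual_rate

# Toy example: a model that under-predicts positives for the minority group.
y_true = np.array([1, 1, 0, 1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 0, 1, 0])
minority = np.array([True, True, False, True, False, True, False, False])
print(underestimation_score(y_true, y_pred, minority))  # 0.5 => underestimation
```

A score like this can serve directly as the second loss in the multi-objective setup above, alongside classification error.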