Numerous approaches have recently been proposed for learning fair representations that mitigate unfair outcomes in prediction tasks. A key motivation for these methods is that the representations can be used by third parties with unknown objectives. However, because current fair representations are generally not interpretable, the third party cannot use the representation for exploration or to gain additional insight beyond the pre-contracted prediction tasks. Thus, to increase data utility beyond prediction tasks, we argue that the representations need to be fair, yet interpretable. We propose a general framework for learning interpretable fair representations by introducing interpretable "prior knowledge" during the representation learning process. We implement this idea and conduct experiments on the ColorMNIST and dSprites datasets. The results indicate that, in addition to being interpretable, our representations attain slightly higher accuracy and fairer outcomes in a downstream classification task than state-of-the-art fair representations.
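To make the general idea concrete, the following is a minimal, hypothetical sketch of training a representation that is useful for a downstream task, hard to exploit for recovering a sensitive attribute, and aligned with hand-picked interpretable factors acting as the "prior knowledge". The module names, loss weights, and the MSE alignment term are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch only (not the authors' code): an encoder trained with
# (i) a task loss, (ii) an adversarial fairness penalty, and (iii) an alignment
# term pulling the latent code toward known interpretable factors a(x)
# (e.g., the generative factors of dSprites). All names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(in_dim, out_dim, hidden=64):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))

x_dim, z_dim, n_classes, n_groups = 784, 8, 10, 2
encoder    = mlp(x_dim, z_dim)      # x -> latent code z
classifier = mlp(z_dim, n_classes)  # downstream task head
adversary  = mlp(z_dim, n_groups)   # tries to recover the sensitive attribute from z

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
opt_adv  = torch.optim.Adam(adversary.parameters(), lr=1e-3)

def train_batch(x, y, s, a, lam_fair=1.0, lam_prior=1.0):
    """x: inputs, y: task labels, s: sensitive attribute (class indices),
    a: interpretable prior attributes with the same dimensionality as z."""
    # 1) Adversary step: learn to predict s from the (detached) representation.
    z = encoder(x).detach()
    adv_loss = F.cross_entropy(adversary(z), s)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()

    # 2) Main step: fit the task, fool the adversary, stay close to the prior.
    z = encoder(x)
    task_loss  = F.cross_entropy(classifier(z), y)
    fair_loss  = -F.cross_entropy(adversary(z), s)  # maximize the adversary's error
    prior_loss = F.mse_loss(z, a)                   # align z with interpretable factors
    loss = task_loss + lam_fair * fair_loss + lam_prior * prior_loss
    opt_main.zero_grad(); loss.backward(); opt_main.step()
    return task_loss.item(), adv_loss.item(), prior_loss.item()
```

Because each latent coordinate is tied to a named factor, a third party can inspect the code directly rather than treating it as an opaque embedding.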
Our goal is to study the predictive performance, interpretability, and fairness of machine learning models for pretrial recidivism prediction. Machine learning methods are known for their ability to automatically generate high-performance models (sometimes even surpassing human performance) from data alone. However, many of the most common machine learning approaches produce "black-box" models: models that perform well but are too complicated for humans to understand. "Interpretable" machine learning techniques seek to produce the best of both worlds: models that perform as well as black-box approaches but are also understandable to humans.

In this study, we generate multiple black-box and interpretable machine learning models. We compare the predictive performance and fairness of the models we generate against two models currently used in the justice system to predict pretrial recidivism: the Risk of General Recidivism and Risk of Violent Recidivism scores from the COMPAS suite, and the New Criminal Activity and New Violent Criminal Activity scores from the Arnold Public Safety Assessment. We first evaluate the predictive performance of all models based on their ability to predict recidivism for six types of crime (general, violent, drug, property, felony, and misdemeanor). Recidivism is defined as a new charge for which an individual is convicted within a specified time frame (which we set at six months or two years). We consider each type of recidivism over these two time periods to control for time, rather than considering predictions over an arbitrarily long or short pretrial period. Next, we examine whether a model constructed using data from one region suffers in predictive performance when applied to predict recidivism in another region. Finally, we consider the latest fairness definitions created by the machine learning community and use them to examine the behavior of the interpretable models, COMPAS, and the Arnold Public Safety Assessment on race and gender subgroups (see the sketch after the list below).

Our findings and contributions can be summarized as follows:

• We contribute a set of interpretable machine learning models that predict recidivism as well as black-box machine learning methods, and better than COMPAS or the Arnold Public Safety Assessment, for the location they were designed for. These models are potentially useful in practice. Like the Arnold Public Safety Assessment, some of these interpretable models can be written down as a simple table that fits on one page of paper; others can be displayed using a set of visualizations.

• We find that recidivism prediction models constructed using data from one location tend not to perform as well when used to predict recidivism in another location, leading us to conclude that models should be constructed on data from the location where they are meant to be used, and updated periodically over time.

• We review the recent literature on algorithmic fairness, but most of the fairness criteria do...
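To make the subgroup analysis concrete, below is a hedged sketch of how per-group error rates and discrimination could be computed for any score-based predictor, whether a one-page scoring table or a black-box model. The specific metrics (per-group FPR, FNR, AUC, base rate) and the decision threshold are illustrative assumptions, not the study's exact evaluation protocol.

```python
# Illustrative sketch of subgroup fairness checks for a binary recidivism predictor.
# The metric choices and column names are assumptions for the example; they do not
# reproduce the study's exact evaluation.
import numpy as np
from sklearn.metrics import roc_auc_score

def group_report(y_true, y_score, groups, threshold=0.5):
    """Per-group error rates and AUC for a score-based predictor.
    y_true: 0/1 recidivism outcomes; y_score: model scores; groups: subgroup labels."""
    y_true, y_score, groups = map(np.asarray, (y_true, y_score, groups))
    y_pred = (y_score >= threshold).astype(int)
    report = {}
    for g in np.unique(groups):
        m = groups == g
        tp = np.sum((y_pred[m] == 1) & (y_true[m] == 1))
        fp = np.sum((y_pred[m] == 1) & (y_true[m] == 0))
        tn = np.sum((y_pred[m] == 0) & (y_true[m] == 0))
        fn = np.sum((y_pred[m] == 0) & (y_true[m] == 1))
        report[g] = {
            "FPR": fp / max(fp + tn, 1),       # non-recidivists incorrectly flagged
            "FNR": fn / max(fn + tp, 1),       # recidivists missed
            "AUC": roc_auc_score(y_true[m], y_score[m])
                   if len(np.unique(y_true[m])) > 1 else float("nan"),
            "base_rate": float(y_true[m].mean()),
        }
    return report

# Example usage: compare two models' race-subgroup gaps on the same held-out data.
# print(group_report(y, scores_interpretable, race))
# print(group_report(y, scores_compas, race))
```

Comparing these per-group quantities across models makes it visible when a model's overall accuracy hides unequal false positive or false negative rates between subgroups.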