The increasing impact of algorithmic decisions on people's lives compels us to scrutinize their fairness and, in particular, the disparate impacts that ostensibly color-blind algorithms can have on different groups. Examples include credit decisioning, hiring, advertising, criminal justice, personalized medicine, and targeted policymaking, where in some cases legislative or regulatory frameworks for fairness exist and define specific protected classes. In this paper we study a fundamental challenge to assessing disparate impacts in practice: protected class membership is often not observed in the data. This is particularly a problem in lending and healthcare. We consider the use of an auxiliary dataset, such as the US census, that includes class labels but not decisions or outcomes. We show that a variety of common disparity measures are generally unidentifiable aside from some unrealistic cases, providing a new perspective on the documented biases of popular proxy-based methods. We provide exact characterizations of the sharpest-possible partial identification set of disparities either under no assumptions or when we incorporate mild smoothness constraints. We further provide optimization-based algorithms for computing and visualizing these sets, which enable reliable and robust assessments, an important tool when disparity assessment can have far-reaching policy implications. We demonstrate this in two case studies with real data: mortgage lending and personalized medicine dosing.
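To make the unidentifiability concrete, here is a minimal sketch of disparity bounds under no assumptions. It is an illustration, not the paper's algorithm: it assumes a binary protected class, a discrete proxy Z (for example, a geography bucket), a binary outcome, and the demographic disparity P(Y=1 | A=a) - P(Y=1 | A=b). The function name `disparity_bounds` and its arguments are hypothetical. Within each proxy level, only the marginals P(Y=1 | Z) (main data) and P(A=a | Z) (auxiliary data) are known, so the joint probability is constrained only by the Fréchet-Hoeffding inequalities; because the disparity is increasing in each joint probability, the two extremes give the endpoints.

```python
import numpy as np

def disparity_bounds(p_z, p_y_given_z, p_a_given_z):
    """Bounds on P(Y=1 | A=a) - P(Y=1 | A=b) for a binary class A in {a, b}.

    p_z          : P(Z=z) for each proxy level z (assumed to sum to 1)
    p_y_given_z  : P(Y=1 | Z=z), from the main dataset (no class labels)
    p_a_given_z  : P(A=a | Z=z), from the auxiliary dataset (no outcomes)
    """
    p_z = np.asarray(p_z, dtype=float)
    p_y = np.asarray(p_y_given_z, dtype=float)
    p_a = np.asarray(p_a_given_z, dtype=float)

    # Frechet-Hoeffding bounds on the unobserved joint P(Y=1, A=a | Z=z).
    lower_joint = np.maximum(0.0, p_y + p_a - 1.0)
    upper_joint = np.minimum(p_y, p_a)

    share_a = np.sum(p_z * p_a)          # P(A=a)
    share_b = np.sum(p_z * (1.0 - p_a))  # P(A=b)

    def disparity(joint):
        mean_a = np.sum(p_z * joint) / share_a
        # P(Y=1, A=b | Z) = P(Y=1 | Z) - P(Y=1, A=a | Z) for a binary class.
        mean_b = np.sum(p_z * (p_y - joint)) / share_b
        return mean_a - mean_b

    # The disparity is increasing in every joint probability, so the
    # smallest and largest feasible joints give the two endpoints.
    return float(disparity(lower_joint)), float(disparity(upper_joint))
```

With an uninformative proxy, `disparity_bounds([1.0], [0.5], [0.5])` spans the whole range from -1 to 1, whereas a proxy that reveals class membership exactly, as in `disparity_bounds([0.5, 0.5], [0.8, 0.3], [1.0, 0.0])`, collapses the interval to a single point.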
We study the problem of estimating treatment effects when the outcome of primary interest (e.g., long-term health status) is only seldom observed but abundant surrogate observations (e.g., short-term health outcomes) are available. To investigate the role of surrogates in this setting, we derive the semiparametric efficiency lower bounds for the average treatment effect (ATE) both with and without the presence of surrogates, as well as in several intermediary settings. These bounds characterize the best-possible precision of ATE estimation in each case, and their difference quantifies the efficiency gains from optimally leveraging the surrogates in terms of key problem characteristics when only limited outcome data are available. We show these results apply in two important regimes: when the number of surrogate observations is comparable to the number of primary-outcome observations and when the former dominates the latter. Importantly, we take a missing-data approach that circumvents strong surrogate conditions that are commonly assumed in previous literature but almost always fail in practice. To show how to leverage the efficiency gains of surrogate observations, we propose ATE estimators and inferential methods that use flexible machine learning methods to estimate the nuisance parameters appearing in the influence functions. We show our estimators enjoy efficiency and robustness guarantees under weak conditions.
We consider the efficient estimation of a low-dimensional parameter in the presence of very high-dimensional nuisances that may depend on the parameter of interest. An important example is the quantile treatment effect (QTE) in causal inference, where the efficient estimating equation involves as a nuisance the conditional cumulative distribution evaluated at the quantile to be estimated. Debiased machine learning (DML) is a data-splitting approach to address the need to estimate nuisances using flexible machine learning methods that may not satisfy strong metric entropy conditions, but applying it to problems with estimand-dependent nuisances would require estimating too many nuisances to be practical. For QTE estimation, DML requires that we learn the whole conditional cumulative distribution function, which may be challenging in practice and stands in contrast to needing to estimate just two regression functions, as in the efficient estimation of average treatment effects. Instead, we propose localized debiased machine learning (LDML), a new three-way data-splitting approach that avoids this burdensome step and need only estimate the nuisances at a single initial rough guess for the parameters. In particular, under a Fréchet-derivative orthogonality condition, we show the oracle estimating equation is asymptotically equivalent to one where the nuisance is evaluated at the true parameter value, and we provide a strategy to target this alternative formulation: construct an initial rough guess for the estimand using one third of the data, estimate the nuisances at this value using flexible machine learning methods on another third of the data, plug in these estimates and solve the estimating equation on the last third of the data, repeat with the thirds permuted, and average the solutions. In the case of QTE estimation, this involves only learning two binary regression models, for which many standard, time-tested machine learning methods exist.
We prove that under certain lax rate conditions, our estimator has the same favorable asymptotic behavior as the infeasible oracle estimator that solves the estimating equation with the true nuisance functions. Thus, our proposed approach uniquely enables practically feasible efficient estimation of important quantities, such as QTEs, in causal inference and other missing data settings.
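The three-way splitting recipe described in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it targets a single quantile of the treated potential outcome (one arm of a QTE) under unconfoundedness, uses a standard doubly robust estimating equation, and substitutes a small hand-rolled logistic regression for the flexible machine learning methods. The names `fit_logistic`, `solve_quantile`, and `ldml_quantile`, the propensity clipping at 0.05, and the choice of the raw treated-sample quantile as the initial guess are all hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-3):
    """Ridge-regularised logistic regression fitted by Newton's method.
    A stand-in for any binary regression learner; returns a predict function."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        grad = Xb.T @ (p - y) + ridge * w
        hess = (Xb * (p * (1 - p))[:, None]).T @ Xb + ridge * np.eye(len(w))
        w -= np.linalg.solve(hess, grad)

    def predict(X_new):
        Z = np.column_stack([np.ones(len(X_new)), X_new])
        return 1.0 / (1.0 + np.exp(-np.clip(Z @ w, -30, 30)))

    return predict

def solve_quantile(Y, T, e, mu, tau):
    """Solve mean(T/e * (1{Y <= theta} - mu) + mu) - tau = 0 for theta.
    mu is held fixed (estimated at the initial guess), so the left-hand side
    is a nondecreasing step function of theta."""
    w = T / e
    const = np.mean(mu - w * mu) - tau       # part that does not depend on theta
    order = np.argsort(Y)
    cum = np.cumsum(w[order]) / len(Y)       # mean(w * 1{Y <= Y_(k)})
    idx = min(np.searchsorted(cum + const, 0.0), len(Y) - 1)
    return Y[order][idx]

def ldml_quantile(X, Y, T, tau=0.5, seed=0):
    """Localized three-way cross-fitting for the tau-quantile of Y(1)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(Y)), 3)
    estimates = []
    for k in range(3):
        a, b, c = folds[k], folds[(k + 1) % 3], folds[(k + 2) % 3]
        # Fold a: rough initial guess (raw quantile among the treated).
        theta_init = np.quantile(Y[a][T[a] == 1], tau)
        # Fold b: two binary regressions, both at the initial guess only.
        e_hat = fit_logistic(X[b], T[b])
        treated = b[T[b] == 1]
        mu_hat = fit_logistic(X[treated], (Y[treated] <= theta_init).astype(float))
        # Fold c: plug in the nuisances and solve the estimating equation.
        e_c = np.clip(e_hat(X[c]), 0.05, 0.95)
        estimates.append(solve_quantile(Y[c], T[c], e_c, mu_hat(X[c]), tau))
    # Average the solutions over the three rotations of the folds.
    return float(np.mean(estimates))
```

Calling `ldml_quantile(X, Y, T, tau=0.5)` with a covariate matrix `X`, outcomes `Y`, and a 0/1 treatment vector `T` returns the averaged estimate. The point of the design is that only two binary regressions are fitted per rotation, rather than a full conditional distribution function.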
We study causal inference when not all confounders are observed and negative controls are available instead. Recent work has shown how negative controls can enable identification and efficient estimation of average treatment effects via two so-called bridge functions. In this paper, we consider a generalized average causal effect (GACE) with general interventions (discrete or continuous) and tackle the central challenge to causal inference using negative controls: the identification and estimation of the two bridge functions. Previous work has largely relied on completeness assumptions for identification and uniqueness assumptions for estimation, and has mainly focused on estimating these functions parametrically. We provide a new identification strategy for GACE that avoids completeness, and propose new minimax-learning estimators for the (nonunique) bridge functions that can accommodate general function classes such as reproducing kernel Hilbert spaces and neural networks and can provide theoretical guarantees even when the bridge functions are nonunique. We study finite-sample convergence results both for estimating the bridge functions themselves and for the final GACE estimator. We do this under a variety of combinations of assumptions on the hypothesis and critic classes employed in the minimax estimator. Depending on how much we are willing to assume, we obtain different convergence rates. In some cases, we show that the GACE estimator may converge to the truth even when our minimax bridge function estimators do not converge to any valid bridge function. In other cases, we show we can obtain semiparametric efficiency.