2020
DOI: 10.48550/arxiv.2007.01283
Preprint

Floodgate: inference for model-free variable importance

Abstract: Many modern applications seek to understand the relationship between an outcome variable Y and a covariate X in the presence of a (possibly high-dimensional) confounding variable Z. Although much attention has been paid to testing whether Y depends on X given Z, in this paper we seek to go beyond testing by inferring the strength of that dependence. We first define our estimand, the minimum mean squared error (mMSE) gap, which quantifies the conditional relationship between Y and X in a way that is determinist…

Cited by 5 publications (8 citation statements)
References 73 publications
“…Remark 2.5 (Connection to Zhang and Janson [150]). When $d = 1$, the numerator of $\rho^2(Y, Z \mid X)$ is equal to the minimum mean squared error gap (mMSE gap): $\mathbb{E}[(Y - \mathbb{E}[Y \mid X])^2] - \mathbb{E}[(Y - \mathbb{E}[Y \mid X, Z])^2]$, which has been used to quantify the conditional relationship between Y and Z given X in the recent paper [150].…”
Section: Kpc: the Population Version (mentioning)
confidence: 99%
“…Remark 2.5 (Connection to Zhang and Janson [150]). When $d = 1$, the numerator of $\rho^2(Y, Z \mid X)$ is equal to the minimum mean squared error gap (mMSE gap): $\mathbb{E}[(Y - \mathbb{E}[Y \mid X])^2] - \mathbb{E}[(Y - \mathbb{E}[Y \mid X, Z])^2]$, which has been used to quantify the conditional relationship between Y and Z given X in the recent paper [150]. Note that mMSE gap is not invariant under arbitrary scalings of Y, but $\rho^2(Y, Z \mid X)$ (which is equal to the squared partial correlation $\rho^2_{YZ \cdot X}$; see Proposition 2.1) is.…”
Section: Kpc: the Population Version (mentioning)
confidence: 99%
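The scaling remark in the statement above can be checked numerically. The following is a minimal sketch, not taken from either paper: it assumes a toy model Y = scale * (X + b*Z + eps) with independent Gaussian X, Z and eps, uses the oracle conditional means that this model implies, and takes the mMSE gap to be E[(Y - E[Y|X])^2] - E[(Y - E[Y|X,Z])^2]. The function name `mmse_gap_and_ratio`, its parameters, and the choice of E[(Y - E[Y|X])^2] as the normalizer are all illustrative assumptions.

```python
import random


def mmse_gap_and_ratio(scale, b=1.5, sigma=1.0, n=20000, seed=0):
    """Monte Carlo estimate of the mMSE gap and a normalized version of it.

    Hypothetical toy model: Y = scale * (X + b*Z + eps), with X, Z ~ N(0, 1)
    and eps ~ N(0, sigma^2), all independent.
    """
    rng = random.Random(seed)
    mse_without_z = 0.0  # accumulates (Y - E[Y|X])^2
    mse_with_z = 0.0     # accumulates (Y - E[Y|X,Z])^2
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, sigma)
        y = scale * (x + b * z + eps)
        # Oracle conditional means, known only because we chose the model.
        mean_given_x = scale * x
        mean_given_xz = scale * (x + b * z)
        mse_without_z += (y - mean_given_x) ** 2
        mse_with_z += (y - mean_given_xz) ** 2
    mse_without_z /= n
    mse_with_z /= n
    gap = mse_without_z - mse_with_z
    # Dividing by the MSE without Z is an assumed normalization.
    return gap, gap / mse_without_z


# Same seed, so both calls see the same draws; only the scale of Y differs.
gap1, ratio1 = mmse_gap_and_ratio(scale=1.0)
gap2, ratio2 = mmse_gap_and_ratio(scale=2.0)
print("gap at scale 1:", gap1, "gap at scale 2:", gap2)
print("ratio at scale 1:", ratio1, "ratio at scale 2:", ratio2)
```

In this toy model the population gap is scale^2 * b^2, so doubling Y quadruples the gap, while the normalized ratio b^2 / (b^2 + sigma^2) does not depend on the scale.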
“…Examples include pre-validation (Tibshirani and Efron, 2002; Höfling and Tibshirani, 2008) and cross-fitting (e.g., Newey and Robins, 2018). Further, confidence intervals for prediction accuracy are used to evaluate variable importance (Williamson et al., 2021; Zhang and Janson, 2020). We suspect that our nested cross-validation proposal could be adapted to improve the accuracy of these and related approaches.…”
Section: Discussion (mentioning)
confidence: 99%
“…Some propose model class reliance (MCR) [21,60] that investigate the feature importance across a set of well-performed models on the same data set, instead of one particular model in use. More related approaches that construct confidence intervals for feature importance include CPI [75], GCM [57], Floodgate [86], and VIME [78,77]. Many are only applicable for regression tasks and not classification, others make specific distributional assumptions that might not hold for general ML problems.…”
Section: Related Work (mentioning)
confidence: 99%