2020
DOI: 10.48550/arxiv.2012.02717
Preprint

Derandomizing Knockoffs

Abstract: Model-X knockoffs is a general procedure that can leverage any feature importance measure to produce a variable selection algorithm, which discovers true effects while rigorously controlling the number or fraction of false positives. Model-X knockoffs is a randomized procedure which relies on the one-time construction of synthetic (random) variables. This paper introduces a derandomization method by aggregating the selection results across multiple runs of the knockoffs algorithm. The derandomization step is d…

Cited by 5 publications (4 citation statements)
References 66 publications
“…In this work, we propose multi split conformal prediction, a simple method based on Markov's inequality to aggregate split conformal prediction intervals across multiple splits. The proposed method is similar in spirit to p-value aggregation (van de Wiel et al, 2009;Meinshausen et al, 2009;DiCiccio et al, 2020) and stability selection (Meinshausen and Bühlmann, 2010;Shah and Samworth, 2013;Ren et al, 2020). In particular, the multi split prediction set includes those points that are included in single split prediction intervals with frequency greater than a user defined threshold.…”
Section: Methods
confidence: 99%
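The aggregation rule quoted above (keep candidate points whose inclusion frequency across single-split intervals exceeds a user-defined threshold) can be sketched as follows; the function name `multi_split_set`, the interval list, and the threshold `tau` are illustrative assumptions, not the cited authors' code:

```python
import numpy as np

def multi_split_set(intervals, grid, tau=0.5):
    """Keep grid points included in more than a fraction tau of the
    B single-split conformal intervals (a sketch of the quoted rule)."""
    intervals = np.asarray(intervals, dtype=float)   # shape (B, 2)
    lo, hi = intervals[:, 0], intervals[:, 1]
    # inclusion frequency of each candidate point across the B splits
    freq = np.mean((grid[:, None] >= lo) & (grid[:, None] <= hi), axis=1)
    return grid[freq > tau]

# toy usage: three overlapping split intervals, majority threshold
ivs = [(0.0, 2.0), (0.5, 2.5), (1.0, 3.0)]
grid = np.linspace(-1.0, 4.0, 501)
agg = multi_split_set(ivs, grid, tau=0.5)   # spans roughly [0.5, 2.5]
```

The aggregated set is the region covered by a majority of the split intervals, which is the "frequency greater than a user defined threshold" idea in the quote.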
“…The value λ = 0 reduces (6) to Markov's bound. However, positive values of λ correspond to tighter bounds achievable under constraints on the shape of the distribution of V_β (Shah and Samworth, 2013; Huber, 2019; Ren et al, 2020). For k = 1 and λ = B − 1, assumption (5) holds if and only if φ…”
Section: Multi Split Conformal Prediction
confidence: 99%
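For context, the λ = 0 case referenced in the quote is plain Markov's inequality; writing V_β for the nonnegative aggregated quantity in the statement (identifying V_β with the object bounded in the cited equation (6) is an assumption from context, since that equation is not reproduced here), the generic bound is

$$
\mathbb{P}\left(V_\beta \ge t\right) \;\le\; \frac{\mathbb{E}\left[V_\beta\right]}{t}, \qquad t > 0,
$$

and, as the quote notes, positive λ tightens this under shape constraints on the distribution of V_β.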
“…We comment that ∆_j in (9) is calculated for only one run of the knockoffs procedure, i.e., we generate the knockoffs only once. This is different from the derandomized knockoffs method recently proposed by Ren et al (2020), which aggregates the selection results across multiple runs of knockoffs to reduce the randomness of the knockoffs generation.…”
Section: Resampling Importance Score and Knockoffs Filtering
confidence: 96%
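The multi-run aggregation that this statement contrasts with can be sketched roughly as follows; `run_knockoff_filter`, `M`, and `eta` are hypothetical names, and this frequency-thresholding sketch is only a caricature of the derandomized knockoffs procedure, not its actual selection rule:

```python
import numpy as np

def derandomized_selection(run_knockoff_filter, p, M=30, eta=0.5, seed=0):
    """Aggregate M randomized knockoff runs by selection frequency.

    run_knockoff_filter(rng) -> boolean array of length p: one run's
    selections, with fresh knockoffs drawn inside each call (hypothetical
    interface). Features selected in more than a fraction eta of the runs
    survive, which damps the randomness of any single knockoff draw.
    """
    rng = np.random.default_rng(seed)
    counts = np.zeros(p)
    for _ in range(M):
        counts += run_knockoff_filter(rng)
    return np.flatnonzero(counts / M > eta)

# toy deterministic stand-in: feature 0 always selected,
# feature 1 selected in only 40% of runs, others never
calls = {"n": 0}
def toy_filter(rng):
    i, calls["n"] = calls["n"], calls["n"] + 1
    sel = np.zeros(5, dtype=bool)
    sel[0] = True
    sel[1] = (i % 10) < 4
    return sel

selected = derandomized_selection(toy_filter, p=5, M=30, eta=0.5)
# feature 0 (frequency 1.0) survives; feature 1 (frequency 0.4) does not
```

In contrast, the quoted method computes ∆_j from a single knockoff draw, trading this kind of stabilization for a much lower computational cost.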
“…We repeat all tests with 100 independent realizations of the U_j variables in (16); this allows some understanding and a possible reduction of the variability of any findings, as our method is randomized. Alternatively, one may repeat the entire analysis starting from the generation of the knockoffs [82]; however, that would be impractical for a data set of this size. In comparison, the cost of resampling the U_j variables many times is negligible.…”
Section: Searching For Consistent Associations
confidence: 99%