2018
DOI: 10.1214/17-aos1572

Variable selection with Hamming loss

Abstract: We derive nonasymptotic bounds for the minimax risk of variable selection under expected Hamming loss in the Gaussian mean model in R^d, for classes of at most s-sparse vectors separated from 0 by a constant a > 0. In some cases, we obtain exact expressions for the nonasymptotic minimax risk as a function of d, s and a, and find the minimax selectors explicitly. These results are extended to dependent or non-Gaussian observations and to the problem of crowdsourcing. Analogous conclusions are obtained for the probabili…
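The setup in the abstract can be illustrated with a small simulation: observe y = theta + sigma*eps in R^d, where theta is s-sparse with nonzero entries at the separation level a, and score a coordinate-wise thresholding selector by its Hamming loss. This is a hedged sketch only: the threshold below comes from a standard likelihood-ratio/Bayes calculation with prior odds s/(d-s), and is not the paper's exact minimax selector; all parameter values (d, s, a, sigma) are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumed, not from the paper):
# d-dimensional Gaussian mean model, theta is s-sparse,
# nonzero entries separated from 0 by a > 0.
d, s, a, sigma = 1000, 10, 3.0, 1.0

theta = np.zeros(d)
support = rng.choice(d, size=s, replace=False)
theta[support] = a                              # nonzero entries sit at the separation level a

y = theta + sigma * rng.standard_normal(d)      # observations y_j = theta_j + sigma * eps_j

# Coordinate-wise thresholding selector. The threshold is the Bayes/likelihood-ratio
# cutoff for N(0, sigma^2) vs N(a, sigma^2) with prior odds s/(d - s); the paper's
# exact minimax selector is not reproduced here.
t = a / 2.0 + (sigma**2 / a) * np.log((d - s) / s)
eta_hat = (np.abs(y) >= t).astype(int)          # selected support
eta = (theta != 0).astype(int)                  # true support

# Hamming loss: number of coordinates where the selector disagrees with the truth.
hamming_loss = int(np.sum(np.abs(eta_hat - eta)))
print(hamming_loss)
```

Raising t trades false positives (null coordinates passing the threshold) against misses (true coordinates falling below it); the paper's results quantify the best achievable expected trade-off.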

Cited by 50 publications (79 citation statements)
References 23 publications
“…We conclude the proof by using the fact that the function $u \mapsto \psi_+(n,p,u,a,\sigma)/u$ is decreasing for $u > 0$ (cf. [20]), so that $\psi_+(n,p,s',a,\sigma) \ge \frac{s'}{s}\,\psi_+(n,p,s,a,\sigma)$.…”
Section: Results (mentioning)
confidence: 99%
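The monotonicity step in the excerpt above can be spelled out, with the notation of the excerpt and under the (implied, not stated) assumption $0 < s' \le s$:

```latex
\frac{\psi_+(n,p,s',a,\sigma)}{s'} \;\ge\; \frac{\psi_+(n,p,s,a,\sigma)}{s}
\qquad \text{for } 0 < s' \le s,
```

since $u \mapsto \psi_+(n,p,u,a,\sigma)/u$ is decreasing on $(0,\infty)$; multiplying both sides by $s'$ yields the stated bound $\psi_+(n,p,s',a,\sigma) \ge \frac{s'}{s}\,\psi_+(n,p,s,a,\sigma)$.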
“…where ε i is a standard Gaussian random variable independent of X i . Thus, conditionally on the design X, we are in the framework of variable selection in the normal means model, where the lower bound techniques developed in [20] can be applied to obtain the result.…”
Section: Non-asymptotic Bounds On the Minimax Risk (mentioning)
confidence: 99%
“…If the driving parameter b tends to infinity we have (see (4) and (5)) n = o(p) and the classification problem in hand becomes a high-dimensional problem. To cope with high dimensionality, some kind of sparsity of the data is often reasonably assumed.…”
Section: General Problem Statement: High-dimensional Setup (mentioning)
confidence: 99%
“…The problem of feature or variable selection with Hamming loss in high dimensions has been extensively studied in the literature (see, for example, [6], [4], [20], [9]), primarily in normal settings, with the majority of the observations coming from a standard normal distribution and a small fraction of observations from a normal distribution with mean $\mu_b \sqrt{\ln b}$ and variance one. By sampling from the above set of observations with or without replacement, which asymptotically, as $b \to \infty$, makes no difference, we arrive at the model, cf.…”
Section: Feature Selection Prior To Classification (mentioning)
confidence: 99%