2019
DOI: 10.48550/arxiv.1912.10784
Preprint

An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

Abstract: We introduce a procedure for predictive conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This predictor minimizes a new general excess risk bound, which critically remains valid under model misspecification. On standard examples, this bound scales as d/n where d is the dimension of the model and n the sample size, regardless of the true distribution. The SMP, which is an improper (out-of-model) procedure, improves over proper (within-model) estimators (such as…
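To make the abstract's description concrete, here is a minimal numerical sketch in the simplest instance: a Gaussian location model N(mu, sigma^2) with known sigma. It assumes the SMP construction described in the paper, namely refitting the maximum-likelihood parameter on the sample augmented with the candidate point y and normalizing the resulting density; the function names are illustrative, not from the paper. In this case the SMP comes out as a Gaussian centered at the empirical mean with standard deviation sigma*(n+1)/n, which lies outside the fixed-variance model, illustrating why the procedure is improper.

import numpy as np

def smp_density(y, sample, sigma=1.0):
    # SMP-style predictive density for the Gaussian location model
    # N(mu, sigma^2) with known sigma: evaluate the model density at y
    # under the MLE refit on the augmented sample (x_1, ..., x_n, y),
    # then normalize. A sketch under the assumptions stated above.
    n = len(sample)
    mu_hat = (np.sum(sample) + y) / (n + 1)  # MLE on the augmented sample
    unnorm = np.exp(-((y - mu_hat) ** 2) / (2 * sigma**2))
    # Since y - mu_hat(y) = n/(n+1) * (y - mean(sample)), the normalized
    # density is Gaussian with mean mean(sample) and std sigma*(n+1)/n,
    # i.e. an out-of-model (improper) density.
    return unnorm / (np.sqrt(2 * np.pi) * sigma * (n + 1) / n)

rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.0, size=20)
ys = np.linspace(-3.0, 7.0, 1001)
print("integrates to ~", np.trapz(smp_density(ys, sample), ys))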

Cited by 5 publications (8 citation statements)
References: 57 publications
“…Finally, Foster et al [25] exploited properties of improper estimators for conditional density estimation with the logistic loss. Mourtada and Gaïffas [47] extended this work to remove log(n) factors for misspecified conditional density estimation in the parametric setting and obtain improved rates for misspecified Gaussian linear regression.…”
Section: Contemporary Results on Density Estimation
confidence: 99%
“…A natural direction for future work is to extend our results beyond the well-specified setting. While recent work [47,30] has made progress on providing (excess) risk bounds in the presence of misspecification, it remains an open problem to characterize the minimax rates for conditional density estimation with misspecified models without tail assumptions on the densities.…”
Section: Discussion
confidence: 99%
“…Apart from [Foster et al, 2018], which is non-practical and also based on an online-to-batch conversion, we are only aware of the works of [Mourtada and Gaïffas, 2019] and [Marteau-Ferey et al, 2019] that improve the exponential constant O(e B ) in the statistical setting. [Marteau-Ferey et al, 2019] make additional assumptions on the data distribution (self-concordance, well-specified model, capacity and source conditions).…”
Section: Online-to-batch Conversion
confidence: 99%
“…Though the latter is proper, using generalized self-concordance properties they could avoid the exponential constant in B under additional assumptions including a well-specified problem, capacity and source conditions. In parallel and independently of this work, [Mourtada and Gaïffas, 2019] have also designed a practical improper algorithm in the statistical setting based on ERM with an improper regularization using virtual data. They could provide an upper-bound on the excess risk in expectation of order O((d + B 2 )/n).…”
Section: Introduction
confidence: 99%
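The "virtual data" idea quoted above can be sketched for binary logistic regression: at prediction time, refit a penalized logistic ERM once per candidate label on the training set augmented with the virtual pair (x_new, label), score each label under its own refit, and normalize. This is only a rough illustration of the mechanism; the regularization choice and exact weighting below are my assumptions, not necessarily the procedure of [Mourtada and Gaïffas, 2019].

import numpy as np
from scipy.optimize import minimize

def fit_logistic(X, y, lam=1e-2):
    # l2-penalized logistic ERM, minimized with L-BFGS (labels in {-1, +1}).
    d = X.shape[1]
    def loss(theta):
        z = y * (X @ theta)
        return np.mean(np.logaddexp(0.0, -z)) + 0.5 * lam * theta @ theta
    return minimize(loss, np.zeros(d), method="L-BFGS-B").x

def smp_predict_proba(X_train, y_train, x_new, lam=1e-2):
    # Improper "virtual data" prediction: for each candidate label, refit
    # the ERM on the training set augmented with (x_new, label), score the
    # candidate under its own refit, then normalize the two scores.
    scores = {}
    for label in (-1.0, 1.0):
        X_aug = np.vstack([X_train, x_new])
        y_aug = np.append(y_train, label)
        theta = fit_logistic(X_aug, y_aug, lam)
        scores[label] = 1.0 / (1.0 + np.exp(-label * (x_new @ theta)))
    return scores[1.0] / (scores[-1.0] + scores[1.0])  # P(y = +1 | x_new)

# Tiny usage example with synthetic data
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=50))
print(smp_predict_proba(X, y, rng.normal(size=3)))

Because the final prediction mixes two separate fits rather than reporting a single in-model logistic classifier, it is improper in the same sense as the density-estimation sketch above.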
“…The paper of Hardt, Recht, and Singer [19] on the stability of gradient descent methods has generated a wave of interest in this direction. Recent works use various notions of stability in their analysis: some authors are motivated by the analysis of gradient descent algorithms [31,29,15,4], while others use the notion of average stability to obtain the in-expectation O(1/n) rate for regularized regression [28,17,46] and some more specific improper learning procedures [36,37]. One of the key open questions left in [19] is related to the lack of high probability generalization bounds.…”
Section: Introduction
confidence: 99%