2021
DOI: 10.48550/arxiv.2108.02120
Preprint

Statistical Analysis of Wasserstein Distributionally Robust Estimators

Abstract: We consider statistical methods which invoke a min-max distributionally robust formulation to extract good out-of-sample performance in data-driven optimization and learning problems. Acknowledging the distributional uncertainty in learning from limited samples, the min-max formulations introduce an adversarial inner player to explore unseen covariate data. The resulting Distributionally Robust Optimization (DRO) formulations, which include Wasserstein DRO formulations (our main focus), are specified using opt…

Cited by 1 publication (2 citation statements) · References 53 publications
“…In contrast to problem (1.1), the effect of explicit regularization on the robustness of models is less direct, but this is compensated by a richer structure that can be used to study the theoretical properties of their solutions more directly. The connections between adversarial training and regularization have been intensely explored in recent years in the context of classical parametric learning settings; see [1,9,15] and references within. For example, when θ ∈ Θ = ℝ^d represents the parameters of a linear regression model and the loss function for the model is the squared loss, the following identity holds:

sup_{P : G_p(P, P_n) ≤ ε} E_P[(Y − θ^⊤X)^2] = ( √(E_{P_n}[(Y − θ^⊤X)^2]) + √ε ‖θ‖_q )^2,  with 1/p + 1/q = 1,

where G_p is an optimal transport distance of the form…”

Section: Introduction (citation type: mentioning, confidence: 99%)
“…In particular, in this setting ‖·‖_q becomes the regularization term R, the risk functional is Ĵ = √J, and λ = √ε. Through an equivalence like (1.3) it is possible to motivate new ways of calibrating regularization parameters in models with a convex loss function (where first-order optimality conditions guarantee global optimality), as has been done in [1]. Beyond linear regression, the equivalence between adversarial training and regularization problems has also been studied in parametric binary classification settings such as logistic regression and SVMs (see [1]), as well as in distributionally robust grouped variable selection and distributionally robust multi-output learning (see [9]).…”

Section: Introduction (citation type: mentioning, confidence: 99%)
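The equivalence discussed in the excerpts above can be sketched numerically: for linear regression under squared loss, the Wasserstein-DRO worst-case risk over a ball of radius ε reduces to a norm-regularized objective with λ = √ε. The snippet below is a minimal illustration of that regularized form only (it does not solve the inner sup directly); the function and variable names are illustrative, not from the paper.

```python
# Sketch of the regularized form of the Wasserstein-DRO worst-case risk
# for linear regression with squared loss, assuming the identity
#   sup_{P: G_p(P, P_n) <= eps} E_P[(Y - theta'X)^2]
#     = ( sqrt(MSE_n(theta)) + sqrt(eps) * ||theta||_q )^2,
# with 1/p + 1/q = 1. Names here are hypothetical, for illustration only.
import numpy as np

def dro_objective(theta, X, y, eps, q=2.0):
    """Regularized reformulation of the worst-case squared loss."""
    rmse = np.sqrt(np.mean((y - X @ theta) ** 2))  # empirical root-MSE
    reg = np.sqrt(eps) * np.linalg.norm(theta, ord=q)  # lambda = sqrt(eps)
    return (rmse + reg) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=200)

# With eps = 0 the objective reduces to the ordinary mean squared error;
# for eps > 0 the sqrt(eps)*||theta||_q term acts as the regularizer R.
mse = np.mean((y - X @ theta_true) ** 2)
assert np.isclose(dro_objective(theta_true, X, y, eps=0.0), mse)
print(dro_objective(theta_true, X, y, eps=0.01))
```

This makes concrete the calibration idea in the second excerpt: choosing the ambiguity radius ε directly pins down the regularization strength λ = √ε, rather than leaving λ as a free tuning parameter.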