2020
DOI: 10.48550/arxiv.2002.10790
Preprint

Biased Stochastic Gradient Descent for Conditional Stochastic Optimization

Yifan Hu,
Siqi Zhang,
Xin Chen
et al.

Abstract: Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning. However, constructing unbiased gradient estimates in CSO is challenging due to the composition structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study the bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives,…
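
For intuition about the composition structure mentioned in the abstract: the CSO objective is nested, roughly F(x) = E_ξ[ f_ξ( E_{η|ξ}[ g_η(x, ξ) ] ) ], and BSGD replaces the inner conditional expectation with an m-sample average, which makes the gradient estimate biased but with bias that shrinks as m grows. The sketch below is a minimal illustration of this idea only; the toy quadratic outer function, the noisy linear inner map, and the sampling routines are assumptions for demonstration, not the authors' implementation or experimental setup.

```python
import numpy as np

# Minimal BSGD-style sketch for Conditional Stochastic Optimization (CSO):
#   min_x  F(x) = E_xi[ f_xi( E_{eta|xi}[ g_eta(x, xi) ] ) ]
# The inner expectation is replaced by an m-sample average, so the resulting
# gradient is biased; a larger inner batch m reduces the bias at extra cost.
# All concrete functions here (toy quadratic f, noisy linear g) are
# illustrative assumptions, not the paper's setup.

rng = np.random.default_rng(0)
d = 5

def sample_outer():
    """Draw one outer sample xi (here: a random target vector)."""
    return rng.normal(size=d)

def sample_inner(x, xi, m):
    """Draw m conditional inner samples g_eta(x, xi) and their Jacobians."""
    # Toy model: g_eta(x, xi) = x + noise, so dg/dx = I for every sample.
    gs = x + 0.1 * rng.normal(size=(m, d))
    jacobians = np.stack([np.eye(d)] * m)
    return gs, jacobians

def f_grad(y, xi):
    """Gradient of the toy outer function f_xi(y) = 0.5 * ||y - xi||^2."""
    return y - xi

def bsgd(x0, steps=200, lr=0.05, m=16):
    x = x0.copy()
    for _ in range(steps):
        xi = sample_outer()
        gs, jacs = sample_inner(x, xi, m)
        y_hat = gs.mean(axis=0)                # biased plug-in estimate of the inner mean
        jac_hat = jacs.mean(axis=0)            # averaged inner Jacobian
        grad = jac_hat.T @ f_grad(y_hat, xi)   # biased gradient of the composition
        x -= lr * grad
    return x

x_out = bsgd(np.zeros(d))
print("final iterate:", x_out)  # drifts toward the mean of the outer targets (~0 here)
```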

Cited by 7 publications (12 citation statements)
References 18 publications
“…Since this is a standard result, similar results are shown in Bernstein et al. (2018); Devolder et al. (2014); Hu et al. (2020); Ajalloeian and Stich (2020). For the sake of completeness, we provide the…”
Section: Methods (supporting)
confidence: 84%
“…The convergence of biased gradients has been studied in a series of previous works (Schmidt et al., 2011; Bernstein et al., 2018; Hu et al., 2020; Ajalloeian and Stich, 2020; Scaman and Malherbe, 2020). We show a similar theorem below for the sake of completeness.…”
Section: Stochastic Gradient Descent With Biased Gradient (mentioning)
confidence: 99%
“…To theoretically understand why GBML works well in practice, we shall comprehend the optimization properties of GBML with DNNs. Several recent works theoretically analyze GBML in the case of convex objectives [12,5,23,16,44]. However, DNNs are always non-convex, so these works do not directly apply to GBML with DNNs.…”
Section: Motivations (mentioning)
confidence: 99%
“…Connection to composite optimization. The proposed doubly variance reduction algorithm shares the same spirit as the variance-reduced composite optimization problem considered in Zhang and Xiao (2019a); Hu et al. (2020); Tran-Dinh et al. (2020); Zhang and Xiao (2019b;c), but with two main differences. Firstly, the objective function is different.…”
mentioning
confidence: 99%