2019
DOI: 10.48550/arxiv.1903.10399
Preprint

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

Giulia Denevi,
Carlo Ciliberto,
Riccardo Grazzi
et al.

Abstract: We study the problem of learning-to-learn: inferring a learning algorithm that works well on tasks sampled from an unknown distribution. As a class of algorithms we consider Stochastic Gradient Descent on the true risk regularized by the squared Euclidean distance to a bias vector. We present an average excess risk bound for such a learning algorithm. This result quantifies the potential benefit of using a bias vector with respect to the unbiased case. We then address the problem of estimating the bias from a seq…
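The within-task algorithm described in the abstract can be illustrated with a minimal sketch: SGD on a squared loss regularized by the squared Euclidean distance to a bias vector h. Setting h = 0 recovers the unbiased case. All names here (`biased_sgd`, `lam`, `lr`) are illustrative assumptions, not the paper's exact algorithm or notation.

```python
import numpy as np

def biased_sgd(X, y, h, lam=0.1, lr=0.05, epochs=50, seed=0):
    """SGD on the objective (1/2n)||Xw - y||^2 + (lam/2)||w - h||^2,
    i.e. squared loss regularized toward the bias vector h."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    w = h.copy()  # starting at the bias is a natural initialization
    for _ in range(epochs):
        for i in rng.permutation(n):
            # per-sample squared-loss gradient plus the biased regularizer
            g = (X[i] @ w - y[i]) * X[i] + lam * (w - h)
            w = w - lr * g
    return w
```

When the tasks' true weight vectors cluster near h, the biased regularizer pulls the iterates toward that cluster, which is the benefit the paper's excess risk bound quantifies relative to h = 0.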

Cited by 9 publications
(13 citation statements)
References 10 publications
“…There has also been a line of recent work providing guarantees for gradient-based meta-learning (MAML) [Finn et al, 2017]. Finn et al [2019], Khodak et al [2019a] and Denevi et al [2019] work in the framework of online convex optimization (OCO) and use a notion of task similarity that assumes closeness of all tasks to a single fixed point in parameter space to provide guarantees. Khodak et al [2019b] strengthens earlier meta-learning guarantees in the OCO framework and provides bounds with more general notions of data-dependent task similarity.…”
Section: Related Work
“…A theoretical study was proposed by [2], but the strategies in this paper are not feasible in practice. This problem was improved recently [11,5,12,17,36,16,14,21]. The closest work to this paper is [13], where the authors propose an efficient strategy to learn the starting point of online gradient descent.…”
Section: Related Work
“…This thesis focuses on the first MAML algorithms, but the techniques here can be extended to analyze the Hessian-free multi-step MAML. Alternatively to meta-initialization algorithms such as MAML, meta-regularization approaches aim to learn a good bias for a regularized empirical risk minimization problem for intra-task learning [2,22,21,20,104,8,132]. [8] formalized a connection between meta-initialization and meta-regularization from an online learning perspective.…”
Section: Related Work
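The meta-regularization idea in the excerpt above — learning a good bias for regularized risk minimization across tasks — can be sketched with a simple outer loop. As a simplification, each task is solved in closed form (ridge regression biased toward the current h) and h is updated as a running average of the task solutions; this averaging heuristic is an illustrative stand-in, not the specific bias estimator analyzed in the cited papers.

```python
import numpy as np

def ridge_with_bias(X, y, h, lam):
    """Closed-form minimizer of (1/2n)||Xw - y||^2 + (lam/2)||w - h||^2."""
    n, d = X.shape
    A = X.T @ X / n + lam * np.eye(d)
    b = X.T @ y / n + lam * h
    return np.linalg.solve(A, b)

def estimate_bias(tasks, lam=1.0):
    """Process a sequence of (X, y) tasks, solving each one biased toward the
    current estimate h, and update h as a running average of the solutions."""
    d = tasks[0][0].shape[1]
    h = np.zeros(d)
    for t, (X, y) in enumerate(tasks, start=1):
        w_t = ridge_with_bias(X, y, h, lam)
        h += (w_t - h) / t  # incremental running average of task solutions
    return h
```

If the tasks' true weight vectors are concentrated around a common center, the estimated h drifts toward that center, so later tasks are regularized toward a point close to their own solutions.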