Proceedings of the 2020 SIAM International Conference on Data Mining (SDM 2020)
DOI: 10.1137/1.9781611976236.45

HIDRA: Head Initialization across Dynamic targets for Robust Architectures

Abstract: The performance of gradient-based optimization strategies depends heavily on the initial weights of the parametric model. Recent works show that there exist weight initializations from which optimization procedures can find the task-specific parameters faster than from uniformly random initializations, and that such a weight initialization can be learned by optimizing a specific model architecture across similar tasks via MAML (Model-Agnostic Meta-Learning). Current methods are limited to populations of classif…
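To make the abstract's premise concrete, the following is a minimal sketch of the MAML idea it refers to: a shared initialization theta is meta-trained so that one gradient step on a new task's support set already performs well on that task's query set. The task family (sine regression), network size, and all hyperparameters below are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def net(params, x):
    # Two-layer MLP written functionally, so we can forward with adapted weights.
    w1, b1, w2, b2 = params
    return torch.relu(x @ w1 + b1) @ w2 + b2

def sample_task(n=10):
    # Hypothetical task family: sine regression with random amplitude and phase.
    amp, phase = torch.rand(1) * 4.9 + 0.1, torch.rand(1) * 3.14
    x = torch.rand(n, 1) * 10.0 - 5.0
    return x, amp * torch.sin(x + phase)

# theta is the learned initialization that MAML optimizes across tasks.
theta = [
    (torch.randn(1, 40) * 0.1).requires_grad_(),
    torch.zeros(40, requires_grad=True),
    (torch.randn(40, 1) * 0.1).requires_grad_(),
    torch.zeros(1, requires_grad=True),
]
meta_opt = torch.optim.Adam(theta, lr=1e-3)
inner_lr = 0.01

for step in range(1000):
    meta_loss = 0.0
    for _ in range(4):  # meta-batch of tasks
        x_s, y_s = sample_task()  # support set: used for adaptation
        x_q, y_q = sample_task()  # query set: evaluates the adapted weights
        # Inner step: one gradient step away from the shared initialization.
        # create_graph=True keeps the dependence on theta, so the meta-gradient
        # can flow back through the adaptation step.
        grads = torch.autograd.grad(F.mse_loss(net(theta, x_s), y_s),
                                    theta, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(theta, grads)]
        # Outer objective: query loss of the task-adapted parameters.
        meta_loss = meta_loss + F.mse_loss(net(fast, x_q), y_q)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```

The outer update moves theta toward a point from which a single inner step suffices on each sampled task, which is exactly the "learned initialization" the abstract contrasts with uniformly random starting weights. HIDRA's contribution, per the abstract, is lifting this setup beyond a fixed architecture and fixed target space.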

Cited by 3 publications (1 citation statement)
References 7 publications
“…One of the first methods to attempt few-shot learning on homogeneous predictors was chameleon [4], which used a convolutional encoder to align tasks from similar domains to a common attribute space before applying gradient-based few-shot methods. Similarly, other works tried to learn across tasks with varied label spaces [9,32]. Finally, Iwata et al. [19] proposed a model that uses deep-set [53] based blocks to compute a task embedding over the predictors and targets of training samples (support data), which can then be combined with new unlabeled samples (query data) to perform classification or regression without the need for retraining or fine-tuning, similar to neighbor-based approaches (we will refer to this method as HetNet throughout the rest of the paper).…”
Section: Related Work
confidence: 99%
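The citing statement describes a deep-set style task embedding: encode the (feature, target) pairs of the support set permutation-invariantly, then condition predictions for unlabeled query points on that embedding, with no per-task retraining. Below is a minimal sketch of that mechanism under stated assumptions; the layer sizes, mean pooling, and prediction head are illustrative choices, not Iwata et al.'s exact architecture.

```python
import torch
import torch.nn as nn

class DeepSetRegressor(nn.Module):
    def __init__(self, d=16):
        super().__init__()
        # phi embeds each (x, y) support pair; mean pooling over pairs makes
        # the task embedding invariant to the order of support samples.
        self.phi = nn.Sequential(nn.Linear(2, d), nn.ReLU(), nn.Linear(d, d))
        # rho maps (task embedding, query feature) to a prediction.
        self.rho = nn.Sequential(nn.Linear(d + 1, d), nn.ReLU(), nn.Linear(d, 1))

    def forward(self, x_support, y_support, x_query):
        pairs = torch.cat([x_support, y_support], dim=-1)   # (n_s, 2)
        task_emb = self.phi(pairs).mean(dim=0)              # (d,) task embedding
        ctx = task_emb.expand(x_query.shape[0], -1)         # broadcast to queries
        return self.rho(torch.cat([ctx, x_query], dim=-1))  # (n_q, 1)

# Usage: a new task needs no fine-tuning; it is handled by re-encoding its
# support set, much like the neighbor-based analogy in the statement above.
model = DeepSetRegressor()
x_s, y_s = torch.randn(5, 1), torch.randn(5, 1)             # labeled support data
pred = model(x_s, y_s, torch.randn(3, 1))                   # 3 query predictions
```

Because adaptation happens through the forward pass rather than through gradient steps, this HetNet-style approach sits at the opposite end of the design space from the MAML-style learned initializations that HIDRA builds on.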