2021
DOI: 10.48550/arxiv.2107.05847
Preprint

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

Abstract: Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews import…

Cited by 19 publications (29 citation statements) | References 80 publications
“…Hyperparameter Search A common part of the ML pipeline is to perform some sort of hyperparameter search. The corresponding tuning strategies remain an open area of research (see Bischl et al., 2021 for a comprehensive overview), but the following rules of thumb exist: if there are very few parameters that can be searched exhaustively under the computation budget, grid search or Bayesian optimization can be applied; otherwise, random search is preferred, as it explores the search space more efficiently (Bergstra & Bengio, 2012).…”
Section: Codebase and Models (mentioning)
confidence: 99%
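To make the rule of thumb above concrete, here is a minimal sketch assuming scikit-learn and an SVM classifier (both illustrative choices, not taken from the cited works): a small, coarse space is searched exhaustively with GridSearchCV, while a larger continuous space is sampled with RandomizedSearchCV under a fixed budget, in the spirit of Bergstra & Bengio (2012).

```python
# Sketch: grid search for a small, coarse space vs. random search for a
# larger continuous space. Estimator, ranges, and budget are illustrative
# assumptions, not taken from the cited papers.
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Very few parameters with coarse levels: exhaustive grid search is feasible.
grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": [1e-3, 1e-2, 1e-1]},
    cv=5,
)
grid.fit(X, y)

# Larger, continuous space: random search explores it more efficiently
# under a fixed evaluation budget (Bergstra & Bengio, 2012).
rand = RandomizedSearchCV(
    SVC(),
    param_distributions={"C": loguniform(1e-2, 1e2),
                         "gamma": loguniform(1e-4, 1e0)},
    n_iter=25,  # fixed budget of 25 sampled configurations
    cv=5,
    random_state=0,
)
rand.fit(X, y)

print("grid:", grid.best_params_, grid.best_score_)
print("random:", rand.best_params_, rand.best_score_)
```

The log-uniform sampling reflects the common practice of searching scale-type hyperparameters such as C and gamma on a logarithmic scale.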
“…Hyperparameter optimization (HPO) methods aim to identify a well-performing hyperparameter configuration (HPC) λ ∈ Λ for an ML algorithm I_λ [1]. An ML learner or inducer I configured by hyperparameters λ ∈ Λ maps a data set D ∈ 𝔻 to a model f, i.e., I : 𝔻 × Λ → H, (D, λ) ↦ f.…”
Section: Hyperparameter Optimization (mentioning)
confidence: 99%
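As a minimal illustration of this notation, the following sketch treats the inducer I as a plain Python function mapping a data set D = (X, y) and a configuration λ to a fitted model f; the random forest learner and its hyperparameter names are assumptions made for the example.

```python
# Sketch of the inducer notation I : 𝔻 × Λ → H, (D, λ) ↦ f.
# The learner (random forest) and hyperparameter names are assumptions.
from sklearn.base import BaseEstimator
from sklearn.ensemble import RandomForestClassifier

def inducer(D: tuple, lam: dict) -> BaseEstimator:
    """Map a data set D = (X, y) and an HPC λ to a fitted model f."""
    X, y = D
    model = RandomForestClassifier(**lam)  # I configured by λ
    return model.fit(X, y)                 # f = I(D, λ)

# Usage (hypothetical data):
# f = inducer((X_train, y_train), {"n_estimators": 200, "max_depth": 5})
```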
“…where λ* denotes the theoretical optimum and c maps an arbitrary HPC to (possibly multiple) target metrics. The classical HPO problem is defined as λ* ∈ arg min_{λ ∈ Λ} GE(I, J, ρ, λ), i.e., the goal is to minimize the estimated generalization error when I (learner), J (resampling splits), and ρ (performance measure) are fixed; see [1] for further details. Instead of optimizing only for predictive performance, other metrics such as model sparsity or the computational efficiency of prediction (e.g., MACs and FLOPs, or model size and memory usage) could be included, resulting in a multi-objective HPO problem [37–41].…”
Section: Hyperparameter Optimization (mentioning)
confidence: 99%
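A hedged sketch of the classical problem above: the estimated generalization error GE(I, J, ρ, λ) is computed by resampling with fixed splits J and measure ρ, and a plain random search stands in for the optimizer over Λ. The learner, search space, and budget are illustrative assumptions.

```python
# Sketch: λ* ∈ arg min_{λ ∈ Λ} GE(I, J, ρ, λ), with GE estimated by
# cross-validation. Learner, search space, and optimizer (plain random
# search) are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

def estimated_ge(lam, X, y, splits, rho="accuracy"):
    """Resampling-based estimate of the generalization error of I_λ."""
    scores = cross_val_score(RandomForestClassifier(**lam), X, y,
                             cv=splits, scoring=rho)
    return 1.0 - scores.mean()  # error = 1 - mean accuracy

def random_search(X, y, n_iter=20, seed=0):
    """Minimize the estimated GE over Λ with I, J, and ρ held fixed."""
    rng = np.random.default_rng(seed)
    splits = KFold(n_splits=5, shuffle=True, random_state=seed)  # fixed J
    best_lam, best_err = None, np.inf
    for _ in range(n_iter):
        lam = {"n_estimators": int(rng.integers(50, 500)),  # Λ: assumed ranges
               "max_depth": int(rng.integers(2, 20))}
        err = estimated_ge(lam, X, y, splits)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err
```

Holding J fixed across all evaluated configurations, as done here, keeps the comparison between configurations fair; a multi-objective variant would return a vector of metrics instead of the scalar error.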
“…Deep neural networks lie at the heart of many of the artificial intelligence applications that are ubiquitous in our society. Over the past several years, methods for training these networks have become more automatic [1,2,3,4,5] but still remain more an art than a science. This paper introduces the high-level concept of general cyclical training as another step in making it easier to optimally train neural networks.…”
Section: Introduction (mentioning)
confidence: 99%