2023
DOI: 10.1007/s11334-023-00540-3

Hyperparameter optimization for deep neural network models: a comprehensive study on methods and techniques

Sunita Roy,
Ranjan Mehera,
Rajat Kumar Pal
et al.

Abstract: Advancements in computing and storage technologies have significantly contributed to the adoption of deep learning (DL)-based models among machine learning (ML) experts. Although a generic model can be used in the search for a near-optimal solution in any problem domain, what makes these DL models context-sensitive is the combination of the training data and the hyperparameters. Due to the lack of inherent explainability of DL models, the hyperparameter optimization (HPO) or tuning specific to each model is a c…

Cited by 6 publications (2 citation statements) | References 29 publications

Citation statements
“…Traditional hyperparameter tuning techniques have been extended to include gradient-based optimization, evolutionary algorithms, simulated annealing, particle swarm optimization, Hyperband, random search, and Bayesian optimization to improve DL models [50,51]. While these methods expand the range of tools available to improve model performance, they often share the same drawbacks: high computational requirements, significant time commitment, and a complexity that can be intimidating to practitioners not trained in ML, such as those working in the energy industry.…”
Section: Stage 1: Optimized Deep Neural Network Architecture with Optuna (mentioning)
confidence: 99%
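The section title above refers to Optuna-based tuning. As a hedged illustration of that style of search (not the citing paper's actual code), here is a minimal Python sketch; the search space, trial count, and placeholder objective are assumptions:

```python
# Minimal Optuna sketch: TPE (Bayesian-style) search over common DNN
# hyperparameters. The objective below is a stand-in; a real study would
# train a model with the sampled settings and return its validation score.
import optuna

def objective(trial):
    # Sample hyperparameters from an assumed search space.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    n_layers = trial.suggest_int("n_layers", 1, 4)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    optimizer = trial.suggest_categorical("optimizer", ["adam", "sgd"])

    # Placeholder score; replace with the validation metric of a model
    # trained with these settings.
    score = 1.0 - 10.0 * abs(lr - 1e-3) - 0.05 * n_layers - 0.1 * dropout
    if optimizer == "sgd":
        score -= 0.02  # pretend adam works slightly better here
    return score

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=50)
print(study.best_params)
```

Optuna's default TPE sampler gives a Bayesian-optimization-style search; passing `sampler=optuna.samplers.RandomSampler()` to `create_study` would reproduce plain random search over the same space.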
“…Table 5 presents a comparison of the prediction model proposed in this paper with four other models, namely the back-propagation neural network (BP), the long short-term memory neural network (LSTM), the bidirectional long short-term memory neural network (BiLSTM), and Attention-BiLSTM. In addition, since particle swarm optimization (PSO) and genetic algorithms (GAs) can also be used for hyperparameter optimization [61], the prediction model in this paper is also compared with the PSO-Attention-BiLSTM and GA-Attention-BiLSTM models, and the range of hyperparameter settings is the same for all three optimization models. Table 5 showcases the prediction averages for the six-month period and the overall prediction results for the second half of the year.…”
Section: Load AGC Reserve Capacity Demand Forecast (mentioning)
confidence: 99%
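To make the PSO-based tuning this statement refers to concrete, here is an illustrative, self-contained Python sketch; the two-dimensional search space, the PSO coefficients, and the placeholder fitness function are all assumptions, not the cited paper's setup:

```python
# Illustrative PSO over a two-dimensional hyperparameter space:
# log10(learning rate) and hidden-unit count. fitness() is a placeholder
# for a trained model's validation score.
import random

BOUNDS = [(-5.0, -1.0), (16.0, 256.0)]  # log10(lr), hidden units
N_PARTICLES, N_ITERS = 10, 30
W, C1, C2 = 0.7, 1.5, 1.5  # inertia, cognitive, social weights (assumed)

def fitness(pos):
    log_lr, units = pos
    # Placeholder: peak near lr = 1e-3 and 128 hidden units.
    return -((log_lr + 3.0) ** 2) - ((units - 128.0) / 64.0) ** 2

particles = [[random.uniform(lo, hi) for lo, hi in BOUNDS]
             for _ in range(N_PARTICLES)]
velocities = [[0.0] * len(BOUNDS) for _ in range(N_PARTICLES)]
pbest = [p[:] for p in particles]            # per-particle best positions
pbest_val = [fitness(p) for p in particles]
best_idx = max(range(N_PARTICLES), key=lambda i: pbest_val[i])
gbest, gbest_val = pbest[best_idx][:], pbest_val[best_idx]  # swarm best

for _ in range(N_ITERS):
    for i, p in enumerate(particles):
        for d in range(len(BOUNDS)):
            r1, r2 = random.random(), random.random()
            velocities[i][d] = (W * velocities[i][d]
                                + C1 * r1 * (pbest[i][d] - p[d])
                                + C2 * r2 * (gbest[d] - p[d]))
            # Move the particle and clamp it to the search bounds.
            p[d] = min(max(p[d] + velocities[i][d], BOUNDS[d][0]),
                       BOUNDS[d][1])
        val = fitness(p)
        if val > pbest_val[i]:
            pbest[i], pbest_val[i] = p[:], val
            if val > gbest_val:
                gbest, gbest_val = p[:], val

print("best hyperparameters:",
      {"learning_rate": 10 ** gbest[0], "hidden_units": round(gbest[1])})
```

A GA-based tuner of the kind compared in the same table would differ mainly in the update rule, replacing the velocity step with selection, crossover, and mutation over the same encoded hyperparameter vectors.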