2019 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata47090.2019.9006104

Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks

Abstract: Learning Rate (LR) is an important hyperparameter to tune for effective training of deep neural networks (DNNs). Even for the baseline of a constant learning rate, it is non-trivial to choose a good constant value for training a DNN. Dynamic learning rates involve multi-step tuning of LR values at various stages of the training process and offer high accuracy and fast convergence. However, they are much harder to tune. In this paper, we present a comprehensive study of 13 learning rate functions and their asso…
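The abstract contrasts a fixed learning rate with dynamic policies that vary the rate over training. As a minimal sketch (not the paper's implementation; the names and decay constants below are illustrative assumptions), two such policies can be written as functions of the training iteration t:

```python
def constant_lr(k0):
    # Constant policy: the same learning rate at every iteration t.
    return lambda t: k0

def exponential_decay_lr(k0, gamma):
    # Exponential-decay policy: the rate shrinks geometrically with t,
    # taking large steps early in training and small steps late.
    return lambda t: k0 * (gamma ** t)

# Assumed values: start at 0.1, decay by a factor of 0.9995 per iteration.
lr_fn = exponential_decay_lr(0.1, 0.9995)
print(lr_fn(0), lr_fn(1000))  # 0.1 at the start, ~0.0607 after 1000 iterations
```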

Cited by 89 publications (38 citation statements) | References 21 publications
“…In the literature, the common learning rate used to train CNN models varies from 0.1 to 0.0001. As 0.001 has been introduced as a reasonable base learning rate value by some researchers [51,52], in this study we set the learning rate equal to 0.001. The weights of the model are updated after passing a batch of the training images through the network instead of a single image at a time.…”
Section: Accuracy
Mentioning, confidence: 99%
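The statement above fixes the learning rate at 0.001 and applies one weight update per mini-batch rather than per image. A minimal NumPy sketch of that update rule (grad_fn and the toy data are hypothetical stand-ins, not the cited paper's model):

```python
import numpy as np

def sgd_batch_step(weights, grad_fn, batch_x, batch_y, lr=0.001):
    # Average the per-example gradients over the whole batch, then
    # apply a single update -- not one update per training image.
    grads = np.stack([grad_fn(weights, x, y) for x, y in zip(batch_x, batch_y)])
    return weights - lr * grads.mean(axis=0)

# Toy usage with a linear model and squared-error gradient (illustrative).
grad = lambda w, x, y: 2 * (w @ x - y) * x
w = np.zeros(3)
xs, ys = np.random.randn(32, 3), np.random.randn(32)
w = sgd_batch_step(w, grad, xs, ys)
```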
“…At epoch 600, the learning rate was decreased from … to …, and at epoch 1200 to …. Decreasing the learning rate is a commonly used approach because it allows greater weight changes at the beginning of the training phase and smaller changes at the end [15].…”
Section: Results
Mentioning, confidence: 99%
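The decay pattern quoted above (drops at epochs 600 and 1200) is a step schedule. A sketch of that policy follows; since the actual rate values did not survive extraction, base_lr and gamma here are assumed placeholders:

```python
def step_decay(epoch, base_lr=0.1, gamma=0.1, milestones=(600, 1200)):
    # Multiply the learning rate by gamma at each milestone epoch,
    # allowing large weight changes early in training and small ones late.
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# With the assumed values: 0.1 before epoch 600, 0.01 until 1200, 0.001 after.
assert step_decay(599) == 0.1
assert step_decay(600) == 0.1 * 0.1
```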
“…The learning rate also determines the speed of the iterations toward the minimum of the loss function. The training process runs faster as the learning rate value increases [15]. However, a learning rate that is too high can cause the loss function value to fluctuate erratically, so several trials are needed to obtain an optimal learning rate value [16].…”
Section: Learning Rate
Unclassified
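Because an overly high rate makes the loss oscillate, the statement above recommends several trials. A minimal trial-and-error sketch; train_and_eval is a hypothetical callable mapping a learning rate to a validation loss:

```python
def pick_learning_rate(train_and_eval, candidates=(0.1, 0.01, 0.001, 0.0001)):
    # Train once per candidate rate and keep the one with the lowest
    # validation loss -- crude, but it mirrors the repeated trials
    # the quoted statement describes.
    losses = {lr: train_and_eval(lr) for lr in candidates}
    return min(losses, key=losses.get)

# Toy objective standing in for a real training run (illustrative).
best = pick_learning_rate(lambda lr: abs(lr - 0.001))
print(best)  # 0.001
```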