2019
DOI: 10.48550/arxiv.1910.12249
Preprint

An Adaptive and Momental Bound Method for Stochastic Learning

Cited by 9 publications (13 citation statements)
References 0 publications
“…We implemented UCondDGCN on the PyTorch platform [33] and conducted experiments on a single NVIDIA TITAN V GPU. We optimized the model by the AdaMod optimizer [10] for 110 epochs with a batch size of 256, in which the learning rate was initially […] Table 1: Quantitative comparisons with state-of-the-art methods on Human3.6M under protocol #1 and protocol #2, where methods marked with † are video-based; T denotes the number of input frames; and CPN and HR-Net denote that the input 2D poses are estimated by [5] and [41], respectively. The best and second-best results are marked in bold and underlined, respectively.…”
Section: Methods
confidence: 99%
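
To make the quoted training setup concrete, the following is a minimal sketch of optimizing a model with AdaMod in PyTorch. It assumes the authors' reference package (installable as adamod, providing "from adamod import AdaMod"); the model, data, and initial learning rate are illustrative placeholders rather than the cited work's configuration, while the epoch count and batch size echo the excerpt.

    # Hedged sketch: training with the AdaMod optimizer in PyTorch.
    # The import assumes the reference adamod package; the model, data,
    # and learning rate below are placeholders, not the cited setup.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from adamod import AdaMod  # assumed reference implementation

    model = nn.Sequential(nn.Linear(34, 256), nn.ReLU(), nn.Linear(256, 51))
    criterion = nn.MSELoss()

    # beta3 controls the exponential moving average that bounds the step sizes.
    optimizer = AdaMod(model.parameters(), lr=1e-3, beta3=0.999)

    # Synthetic data standing in for 2D-to-3D pose pairs.
    inputs, targets = torch.randn(1024, 34), torch.randn(1024, 51)
    loader = DataLoader(TensorDataset(inputs, targets), batch_size=256, shuffle=True)

    for epoch in range(110):            # 110 epochs, as in the excerpt
        for x, y in loader:             # batch size 256, as in the excerpt
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()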
“…Some research shows that the learning rate is one of the most important hyperparameters for deep learning because it directly affects the gradient descent steps and controls the speed of network convergence toward the global minimum by navigating a non-convex loss surface [6]. How to find an optimal learning rate becomes a huge challenge for our experiments.…”
Section: Learning Rate Scheduler
confidence: 99%
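
As a generic illustration of the point made in this excerpt, the sketch below pairs an optimizer with a standard PyTorch step-decay schedule; the scheduler type, decay interval, and factor are assumptions chosen for illustration, not choices made by the cited work.

    # Hedged sketch: controlling the learning rate with a step-decay schedule.
    # Scheduler type and decay parameters are illustrative assumptions.
    import torch
    import torch.nn as nn

    model = nn.Linear(10, 1)                                   # placeholder model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Halve the learning rate every 30 epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

    for epoch in range(100):
        x, y = torch.randn(32, 10), torch.randn(32, 1)         # placeholder batch
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                                       # decay once per epoch
        # scheduler.get_last_lr()[0] gives the rate used in the next epoch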
“…As for optimizers, we used the very popular Adam optimizer. We also tried the recently introduced diffGrad [16] and AdaMod [17], but they did not provide us with any improvements (we did not perform hyper-parameter tuning). A comparison of the optimizers is depicted in Fig.…”
Section: Model Optimization
confidence: 99%
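
In the spirit of the quoted comparison, the sketch below runs two optimizers with default settings (no hyper-parameter tuning) on the same model and data. The AdaMod import assumes the reference adamod package; diffGrad is omitted to avoid assuming a particular third-party API; the model and data are synthetic placeholders.

    # Hedged sketch: comparing optimizers under identical, untuned settings.
    import torch
    import torch.nn as nn
    from adamod import AdaMod  # assumed reference implementation

    def make_model():
        torch.manual_seed(0)               # identical initialization per run
        return nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))

    x, y = torch.randn(512, 20), torch.randn(512, 1)

    candidates = {
        "Adam":   lambda params: torch.optim.Adam(params, lr=1e-3),
        "AdaMod": lambda params: AdaMod(params, lr=1e-3),
    }

    for name, build in candidates.items():
        model = make_model()
        opt = build(model.parameters())
        for step in range(200):
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(x), y)
            loss.backward()
            opt.step()
        print(f"{name}: final training loss {loss.item():.4f}")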