2020
DOI: 10.1109/tnnls.2019.2955777
diffGrad: An Optimization Method for Convolutional Neural Networks

Cited by 201 publications (108 citation statements) · References 25 publications
“…Although the ImageNet base dataset does not include teeth images, various studies have shown that the fine-tuning of a pretrained network with several images of ImageNet helps improve the performance in disease-related problem learning using medical images [25][26][27] . The network was trained using the cross-entropy loss function and adaptive moment estimation (Adam) optimizer 28 with a learning rate of 1e-5 and a batch size of 32. We trained the network for 40,000 iterations and validated it using validation data every 1000 iterations with a classification accuracy metric to determine whether to stop the training.…”
Section: Methods (mentioning)
confidence: 99%
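The fine-tuning recipe quoted above translates into a fairly standard training loop. Below is a minimal PyTorch sketch under stated assumptions: the tiny CNN and random tensors are hypothetical stand-ins for the cited authors' pretrained network and dental-image data, and only the quoted settings (cross-entropy loss, Adam, learning rate 1e-5, batch size 32, 40,000 iterations, validation on classification accuracy every 1,000 iterations) are taken from the excerpt.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the pretrained network and the dental-image data;
# only the hyperparameters below (Adam, lr=1e-5, batch size 32, 40,000 iterations,
# validation every 1,000 iterations) come from the quoted excerpt.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 2),
)
train_set = TensorDataset(torch.randn(256, 3, 64, 64), torch.randint(0, 2, (256,)))
val_set = TensorDataset(torch.randn(64, 3, 64, 64), torch.randint(0, 2, (64,)))
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)

def accuracy(net, loader):
    """Classification accuracy used as the validation metric."""
    net.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (net(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    net.train()
    return correct / total

step, best_acc = 0, 0.0
while step < 40_000:
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        step += 1
        if step % 1_000 == 0:
            acc = accuracy(model, val_loader)
            best_acc = max(best_acc, acc)  # in practice, stop when accuracy stops improving
        if step >= 40_000:
            break
```

The stopping rule is only sketched here as tracking the best validation accuracy; the excerpt does not specify the exact criterion beyond periodic validation.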
“…Here, kernel size and kernel count are the most influential factors for recognition or detection performance. For the hyperparameters other than kernel size and kernel count in Table 3, value ranges close to optimal have already been reported in various studies [32, 33, 34].…”
Section: Related Research (mentioning)
confidence: 90%
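The excerpt singles out kernel size and kernel count (the number of filters) as the most influential hyperparameters; the Table 3 it refers to belongs to the citing paper and is not reproduced here. Purely as an illustration, a small sweep over those two knobs for a single convolutional block might look like the following sketch, where every specific value is an assumption:

```python
import torch
from torch import nn

def make_conv_block(kernel_size: int, kernel_count: int) -> nn.Module:
    # 'kernel_count' is the number of output channels (filters) of the layer.
    return nn.Sequential(
        nn.Conv2d(3, kernel_count, kernel_size, padding=kernel_size // 2),
        nn.ReLU(),
    )

x = torch.randn(1, 3, 32, 32)          # dummy input image
for k in (3, 5, 7):                    # candidate kernel sizes
    for c in (16, 32, 64):             # candidate kernel (filter) counts
        y = make_conv_block(k, c)(x)
        print(f"kernel_size={k}, kernel_count={c}, output shape={tuple(y.shape)}")
```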
“…We compare t-Adam mainly with Adam, but also with another robust gradient-descent algorithm, RoAdam [21]; we also present a comparison between some popular or recent optimization methods (mostly variants of Adam, i.e., AdaBound [18], AdamW [31], DiffGrad [32], RAdam [33], PAdam [34], Yogi [35], and LaProp [36]) and their t-versions. Note that we are not exhaustive in our selection and that the t-momentum can be integrated into other momentum-based optimization methods.…”
Section: Methods (mentioning)
confidence: 99%
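Among the methods listed, DiffGrad [32] is the optimizer this report is about. As a rough sketch of the idea rather than the authors' reference implementation, a diffGrad-style step scales the Adam update by a friction coefficient equal to the sigmoid of the absolute difference between consecutive gradients; the function name, hyperparameter defaults, and toy objective below are illustrative assumptions.

```python
import torch

def diffgrad_step(param, grad, state, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
    """One diffGrad-style update: an Adam step scaled by sigmoid(|g_prev - g|)."""
    t = state["t"] + 1
    m = betas[0] * state["m"] + (1 - betas[0]) * grad          # first moment (as in Adam)
    v = betas[1] * state["v"] + (1 - betas[1]) * grad * grad   # second moment (as in Adam)
    m_hat = m / (1 - betas[0] ** t)                            # bias correction
    v_hat = v / (1 - betas[1] ** t)
    xi = torch.sigmoid((state["g_prev"] - grad).abs())         # friction coefficient
    param = param - lr * xi * m_hat / (v_hat.sqrt() + eps)
    state.update(m=m, v=v, g_prev=grad.clone(), t=t)
    return param

# Toy usage on a quadratic objective ||w||^2 (illustrative only):
w = torch.tensor([2.0, -3.0])
state = dict(m=torch.zeros_like(w), v=torch.zeros_like(w),
             g_prev=torch.zeros_like(w), t=0)
for _ in range(200):
    grad = 2 * w                    # gradient of ||w||^2
    w = diffgrad_step(w, grad, state)
```

With this coefficient, a parameter whose gradient barely changes between steps takes a damped update (the sigmoid approaches 0.5 at zero difference), while a sharply changing gradient restores close to the full Adam step (the sigmoid approaches 1).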
“…Proof: First, we start by noticing that the basic regret bound from the convergence proof by Reddi et al. [8] also holds for t-Adam, that is, the bound (32), where…”
Section: A Proof Of Theorem (mentioning)
confidence: 99%