2019 IEEE International Conference on Big Data (Big Data)
DOI: 10.1109/bigdata47090.2019.9006104

Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks

Abstract: Learning Rate (LR) is an important hyperparameter to tune for effective training of deep neural networks (DNNs). Even for the baseline of a constant learning rate, it is non-trivial to choose a good constant value for training a DNN. Dynamic learning rates involve multi-step tuning of LR values at various stages of the training process and offer high accuracy and fast convergence. However, they are much harder to tune. In this paper, we present a comprehensive study of 13 learning rate functions and their asso…
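The abstract contrasts a fixed learning rate with dynamic policies that vary the rate over training. As a minimal sketch (not the paper's implementation; the names and decay constants below are illustrative assumptions), two such policies can be written as functions of the training iteration t:

```python
def constant_lr(k0):
    # Constant policy: the same learning rate at every iteration t.
    return lambda t: k0

def exponential_decay_lr(k0, gamma):
    # Exponential-decay policy: the rate shrinks geometrically with t,
    # taking large steps early in training and small steps late.
    return lambda t: k0 * (gamma ** t)

# Assumed values: start at 0.1, decay by a factor of 0.9995 per iteration.
lr_fn = exponential_decay_lr(0.1, 0.9995)
print(lr_fn(0), lr_fn(1000))  # 0.1 at the start, ~0.0607 after 1000 iterations
```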

Cited by 89 publications (38 citation statements) | References 21 publications
“…In the literature, the common learning rate used to train CNN models varies from 0.1 to 0.0001. As 0.001 has been introduced as a reasonable base learning rate value by some researchers [51,52], in this study we set the learning rate equal to 0.001. The weights of the model are updated after passing a batch of the training images through the network instead of a single image at a time.…”
Section: Accuracy
Mentioning, confidence: 99%
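The statement above fixes the learning rate at 0.001 and applies one weight update per mini-batch rather than per image. A minimal NumPy sketch of that update rule (grad_fn and the toy data are hypothetical stand-ins, not the cited paper's model):

```python
import numpy as np

def sgd_batch_step(weights, grad_fn, batch_x, batch_y, lr=0.001):
    # Average the per-example gradients over the whole batch, then
    # apply a single update -- not one update per training image.
    grads = np.stack([grad_fn(weights, x, y) for x, y in zip(batch_x, batch_y)])
    return weights - lr * grads.mean(axis=0)

# Toy usage with a linear model and squared-error gradient (illustrative).
grad = lambda w, x, y: 2 * (w @ x - y) * x
w = np.zeros(3)
xs, ys = np.random.randn(32, 3), np.random.randn(32)
w = sgd_batch_step(w, grad, xs, ys)
```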
“…At epoch 600, the learning rate was decreased from … to …, and at epoch 1200 to …. Decreasing the learning rate is a commonly used approach because it allows greater weight changes at the beginning of the training phase and smaller changes at the end [15].…”
Section: Results
Mentioning, confidence: 99%
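The decay pattern quoted above (drops at epochs 600 and 1200) is a step schedule. A sketch of that policy follows; since the actual rate values did not survive extraction, base_lr and gamma here are assumed placeholders:

```python
def step_decay(epoch, base_lr=0.1, gamma=0.1, milestones=(600, 1200)):
    # Multiply the learning rate by gamma at each milestone epoch,
    # allowing large weight changes early in training and small ones late.
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# With the assumed values: 0.1 before epoch 600, 0.01 until 1200, 0.001 after.
assert step_decay(599) == 0.1
assert step_decay(600) == 0.1 * 0.1
```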
“…The learning rate also determines the speed of the iterations toward the minimum of the loss function. The training process runs faster as the learning rate value increases [15]. However, a learning rate that is too high can cause the loss function value to fluctuate erratically, so several trials are needed to obtain an optimal learning rate value [16].…”
Section: Learning Rate
Unclassified
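Because an overly high rate makes the loss oscillate, the statement above recommends several trials. A minimal trial-and-error sketch; train_and_eval is a hypothetical callable mapping a learning rate to a validation loss:

```python
def pick_learning_rate(train_and_eval, candidates=(0.1, 0.01, 0.001, 0.0001)):
    # Train once per candidate rate and keep the one with the lowest
    # validation loss -- crude, but it mirrors the repeated trials
    # the quoted statement describes.
    losses = {lr: train_and_eval(lr) for lr in candidates}
    return min(losses, key=losses.get)

# Toy objective standing in for a real training run (illustrative).
best = pick_learning_rate(lambda lr: abs(lr - 0.001))
print(best)  # 0.001
```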