2019 IEEE International Conference on Data Mining (ICDM)
DOI: 10.1109/icdm.2019.00068

Towards Making Deep Transfer Learning Never Hurt

Abstract: Transfer learning has frequently been used to improve deep neural network training by incorporating the weights of a pre-trained network as the starting point of optimization and for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from an inappropriate network can hurt the training procedure and may lead to even lower accuracy. In this paper, we consider deep transfer learning as minimizing a linear combination of empirical lo…
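The abstract frames deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer anchored at the pre-trained weights (the L2-SP idea also discussed in the citing papers below). The snippet here is a minimal sketch of that combined objective, assuming a PyTorch-style setup; the names `l2sp_penalty`, `fine_tune_step`, and the coefficient `alpha` are illustrative and not the paper's actual implementation.

```python
import torch


def l2sp_penalty(model, pretrained_state):
    """Sum of squared distances between the current weights and the
    pre-trained starting point (decay towards w_0 rather than towards 0)."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if name in pretrained_state:
            penalty = penalty + ((param - pretrained_state[name]) ** 2).sum()
    return penalty


def fine_tune_step(model, pretrained_state, batch, loss_fn, optimizer, alpha=0.01):
    """One optimization step on: empirical loss + alpha * L2-SP regularizer."""
    inputs, targets = batch
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = loss_fn(outputs, targets) + alpha * l2sp_penalty(model, pretrained_state)
    loss.backward()
    optimizer.step()
    return loss.item()


# Snapshot the pre-trained weights once, before fine-tuning begins, e.g.:
# pretrained_state = {k: v.clone().detach() for k, v in model.state_dict().items()}
```

Setting `alpha` to 0 recovers ordinary fine-tuning; larger values pull the solution closer to the pre-trained weights.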

Cited by 17 publications (22 citation statements) · References 27 publications

“…However, all the models were trained with eight years of data and are thus still capable of producing accurate predictions. The dataset size affects deep learning performance, as a small amount of training data may degrade the performance [51]. We recreated the DL model for the Malacca Strait (the area with the most extensive dataset size) for each variation in the training set to show the effect of the data size on the model performance (see Figure 11).…”
Section: Discussion: Influence of Data Size
mentioning, confidence: 99%
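The citing study above describes rebuilding its deep-learning model on progressively smaller training subsets to show how dataset size affects performance. Below is a minimal sketch of such a learning-curve experiment; the `train_and_evaluate` helper, the subset fractions, and the nested-subset design are assumptions for illustration, not the authors' actual pipeline.

```python
import numpy as np


def learning_curve(X_train, y_train, X_val, y_val, train_and_evaluate,
                   fractions=(0.25, 0.5, 0.75, 1.0), seed=0):
    """Train on nested subsets of the training data and record the
    validation score for each subset size."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X_train))  # fix one shuffle so subsets are nested
    results = []
    for frac in fractions:
        n = max(1, int(frac * len(X_train)))
        idx = order[:n]
        score = train_and_evaluate(X_train[idx], y_train[idx], X_val, y_val)
        results.append((n, score))
    return results
```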
“…4. L2SP regularization, where the weights are decayed towards their pre-trained values rather than 0 during fine-tuning, improves performance when the source and target datasets are closely related, but hinders it when they are less related [21,20,37]. 5. Momentum should be lower for more closely related source and target datasets [18].…”
Section: Related Work
mentioning, confidence: 99%
“…They showed that a high level of regularization decaying towards the pre-trained weights is beneficial on these datasets. It has since been shown that the L2SP regularizer can result in minimal improvement or even worse performance when the source and target datasets are less related [18,37].…”
Section: L2SP
mentioning, confidence: 99%