2020 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn48605.2020.9206989
Joint Progressive Knowledge Distillation and Unsupervised Domain Adaptation

Cited by 21 publications (7 citation statements). References 13 publications.
“…While [30] achieves good performance for MTDA, we argue that it does not take full advantage of current UDA techniques, and both of these techniques require substantial resources. In addition, it uses only the source domain for distillation, which has been shown in [21] to limit knowledge transfer and can reduce accuracy. Moreover, distillation with the source only guarantees consistency w.r.t. the source domain and does not guarantee that features will remain domain-invariant with respect to previous target domains.…”
Section: Multi-target Domain Adaptation (mentioning)
confidence: 99%
“…Previous Ensemble-Distillation Method: In this section, we explain how the concepts of previous ensemble-distillation works [42][43][44][45][46][47][48][49] can be borrowed to perform semantic-segmentation-based UDA ensemble-distillation tasks. Typically, these works view T as a set of probabilistic models and complete the ensemble-distillation process by minimizing the negative log-likelihood loss L_KL between the expected outputs of the ensemble and the student model, as depicted in Fig.…”
Section: Preliminary (mentioning)
confidence: 99%
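The KL-based ensemble-distillation objective described in the statement above can be illustrated with a minimal sketch: average the probabilistic outputs of the ensemble members and minimize the KL divergence between that expected distribution and the student's output. This is not the cited authors' implementation; the uniform averaging of teacher probabilities, the temperature parameter, and all names below are assumptions for illustration.

```python
# Minimal sketch of KL-based ensemble distillation (illustrative only; uniform
# teacher averaging and the temperature are assumptions, not the cited method).
import torch
import torch.nn.functional as F


def ensemble_distillation_loss(student_logits, teacher_logits_list, temperature=1.0):
    """KL divergence between the averaged teacher distribution and the student."""
    # Expected (averaged) probability over the ensemble of probabilistic teachers.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=1) for t in teacher_logits_list]
    ).mean(dim=0)
    # Student log-probabilities at the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=1)
    # KL(teacher || student); "batchmean" matches the mathematical definition.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")


if __name__ == "__main__":
    # Random logits stand in for the ensemble and student outputs.
    batch, num_classes = 4, 10
    teachers = [torch.randn(batch, num_classes) for _ in range(3)]
    student = torch.randn(batch, num_classes, requires_grad=True)
    loss = ensemble_distillation_loss(student, teachers, temperature=2.0)
    loss.backward()
    print(float(loss))
```

In a segmentation setting the same loss would simply be applied per pixel over the class dimension of the logit maps.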
“…To address such a problem, the concept of ensemble distillation [42][43][44][45][46][47][48][49][50] can be leveraged, since its focus is on designing an effective distillation process instead of a costly end-to-end ensemble-learning framework. Typically, these ensemble-distillation frameworks view the members of an ensemble as probabilistic models and transfer the knowledge using expected certainty outputs.…”
Section: Introduction (mentioning)
confidence: 99%
“…Several works [13] have been proposed in the classification paradigm to learn the representations of out-of-distribution data. In particular, [14,15] utilize a knowledge-distillation strategy in image classification to learn out-of-distribution data. However, to the best of our knowledge, only [16] has proposed a domain adaptation strategy in a regression setting to improve keypoint detection.…”
Section: Introduction (mentioning)
confidence: 99%