2021
DOI: 10.48550/arxiv.2104.03164
Preprint

Distilling and Transferring Knowledge via cGAN-generated Samples for Image Classification and Regression

Abstract: Knowledge distillation (KD) has been actively studied for image classification tasks in deep learning, aiming to improve the performance of a lightweight (student) model based on the knowledge from a heavyweight (teacher) model. However, applying KD in image regression with a scalar response variable has been rarely studied, and there exists no KD method applicable to both classification and regression tasks yet. Moreover, existing KD methods often require a practitioner to carefully select or adjust the teach…
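For context, the teacher-student setup the abstract refers to is usually instantiated as the classic logit-matching KD objective of Hinton et al.; the sketch below shows that standard formulation only, not the cGAN-based method proposed in this preprint. The function name `kd_loss`, temperature `T`, and weight `alpha` are illustrative choices, not taken from the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic logit-matching KD: soft teacher term plus hard-label term."""
    # Soften both distributions with temperature T; scale by T^2 as usual.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```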

Cited by 2 publications (4 citation statements; citing works published 2021–2022); references 31 publications.

“…In contrast, our joint training method is non-adversarial and can even be implemented by directly minimizing one unified objective. On a conceptual level, our MDN training approach is also related to work on teacher-student networks and knowledge distillation [27,40,61,8]. In a knowledge distillation problem, a teacher network is utilized to improve the performance of a more lightweight student network.…”
Section: Related Work (citation type: mentioning)
Confidence: 99%
“…In a knowledge distillation problem, a teacher network is utilized to improve the performance of a more lightweight student network. While knowledge distillation for regression is not a particularly well-studied topic, it has been studied for image-based regression tasks in very recent work [8]. A student network is there enhanced by augmenting its training set with images and pseudo targets generated by a conditional GAN and a pre-trained teacher network, respectively.…”
Section: Related Work (citation type: mentioning)
Confidence: 99%
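A rough, hypothetical sketch of the augmentation loop described in the statement above: a conditional GAN supplies extra images, a frozen pre-trained teacher assigns scalar pseudo targets to them, and the student is trained on real and synthetic pairs together. The names `generator`, `teacher`, `student`, `sample_labels`, and the weight `lam` are placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def make_synthetic_batch(generator, teacher, batch_size, latent_dim, sample_labels):
    # Draw latent noise and conditioning labels, generate images with the cGAN,
    # and label them with the frozen teacher's scalar predictions.
    z = torch.randn(batch_size, latent_dim)
    y_cond = sample_labels(batch_size)
    fake_images = generator(z, y_cond)
    pseudo_targets = teacher(fake_images).squeeze(1)
    return fake_images, pseudo_targets

def student_step(student, optimizer, real_x, real_y, fake_x, fake_y, lam=1.0):
    # One regression training step on a mix of real pairs and synthetic pairs.
    optimizer.zero_grad()
    loss_real = F.mse_loss(student(real_x).squeeze(1), real_y)
    loss_fake = F.mse_loss(student(fake_x).squeeze(1), fake_y)
    loss = loss_real + lam * loss_fake
    loss.backward()
    optimizer.step()
    return loss.item()
```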
“…Xu et al (Xu, Liu, Li and Loy, 2020) further found that self-supervision signals can effectively transfer the hidden information from the teacher to the student via their proposed SSKD architecture, which substantially benefited the scenarios with few-shot and noisy labels. Ding et al (Ding, Wang, Xu, Wang and Welch, 2021c) proposed a data augmentation-based knowledge distillation framework for classification and regression tasks via synthetic samples (Ding, Wang, Xu, Welch and Wang, 2021d; Ding, Wang, Wang and Welch, 2021b).…”
Section: Introduction (citation type: mentioning)
Confidence: 99%