Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence 2019
DOI: 10.24963/ijcai.2019/573

Amalgamating Filtered Knowledge: Learning Task-customized Student from Multi-task Teachers

Abstract: Many well-trained Convolutional Neural Network (CNN) models have now been released online by developers for the sake of effortless reproduction. In this paper, we treat such pre-trained networks as teachers and explore how to learn a target student network for customized tasks, using multiple teachers that handle different tasks. We assume no human-labelled annotations are available, and each teacher model can be either a single- or multi-task network, where the former is a degenerate case of the latter. The stud…
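To make the setting concrete, here is a minimal sketch of label-free multi-teacher distillation in PyTorch. It uses plain soft-target matching rather than the paper's filtering mechanism, and `student`, `teachers`, `images`, and the optimizer are hypothetical placeholders, not the authors' code:

```python
import torch
import torch.nn.functional as F

def amalgamation_step(student, teachers, images, optimizer):
    # One label-free distillation step: the student regresses onto the
    # concatenated soft predictions of the frozen teacher networks.
    with torch.no_grad():
        targets = torch.cat([t(images) for t in teachers], dim=1)
    preds = student(images)            # student head covers all teacher tasks
    loss = F.mse_loss(preds, targets)  # no human labels involved
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```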

Cited by 25 publications (21 citation statements); references 6 publications.

Citation statements (ordered by relevance):
“…In order to handle the multi-task problem in one single network, the work of [33] proposes an effective method to train the student network on multiple scene-understanding tasks, which leads to better performance than the teachers. Taking this further, Ye et al. [35] apply a two-step filter strategy to customize an arbitrary task set on TargetNet.…”
Section: Related Work
confidence: 99%
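The "two-step filter strategy" quoted above suggests selecting teacher knowledge at two granularities. What follows is a hedged illustration of that idea, not the authors' actual implementation; the function, its arguments, and the channel-energy criterion are all assumptions made for the sketch:

```python
import torch

def filter_teacher_knowledge(teacher_outputs, wanted_tasks, keep_ratio=0.5):
    # Step 1 (task level): keep only the branches that belong to the
    # customized task set requested for the student.
    kept = {t: f for t, f in teacher_outputs.items() if t in wanted_tasks}
    # Step 2 (feature level): within each kept branch, retain the
    # channels with the highest activation energy.
    filtered = {}
    for task, feat in kept.items():               # feat: (N, C, H, W)
        energy = feat.pow(2).mean(dim=(0, 2, 3))  # per-channel energy
        k = max(1, int(keep_ratio * feat.size(1)))
        idx = energy.topk(k).indices              # most informative channels
        filtered[task] = feat[:, idx]
    return filtered
```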
“…Motivated by the work of [35], we use the block-wise training strategy to transfer as much knowledge as possible from the generator G into the dual-generator T. That is, we divide the dual-generator into B blocks $\{T^b\}_{b=1}^{B}$. Then the task-level filtering $g_m$ is applied to meet the task-customizable demand and is embedded into the m-th branch block-wise adversarial loss:…”
Section: Dual-generator Training
confidence: 99%
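On a rough reading of this excerpt, one discriminator per block can be trained so that the dual-generator's block features become indistinguishable from the source generator's, with the task-level filter applied before the comparison. A sketch under those assumptions; the loss form and every name here are illustrative, not the cited paper's exact objective:

```python
import torch
import torch.nn.functional as F

def blockwise_adversarial_losses(gen_feats, dual_feats, discriminators, g_m=None):
    # For each of the B blocks: the discriminator learns to separate the
    # source generator's features ("real") from the dual-generator's
    # ("fake"); the dual-generator is trained to fool it. The optional
    # task-level filter g_m restricts which channels are compared.
    d_loss, g_loss = 0.0, 0.0
    for disc, real, fake in zip(discriminators, gen_feats, dual_feats):
        if g_m is not None:
            real, fake = g_m(real), g_m(fake)
        real_logits = disc(real)
        fake_logits = disc(fake.detach())  # detach: only the disc learns here
        d_loss = d_loss + (
            F.binary_cross_entropy_with_logits(
                real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(
                fake_logits, torch.zeros_like(fake_logits)))
        fool_logits = disc(fake)  # no detach: gradients reach the dual-generator
        g_loss = g_loss + F.binary_cross_entropy_with_logits(
            fool_logits, torch.ones_like(fool_logits))
    return d_loss, g_loss
```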
“…It adopts an auto-encoder architecture to amalgamate features from multiple single-task teachers. Several knowledge amalgamation methods have also been proposed to handle the above task [33,23,34]. The approach proposed here, on the other hand, handles teachers working on either single or multiple tasks, and follows a dual-stage strategy tailored for customizing the student network, which also gives rise to component nets as byproducts.…”
Section: Related Work
confidence: 99%
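A minimal sketch of the auto-encoder amalgamation idea mentioned in this excerpt: teacher feature maps are concatenated, encoded into a shared code, and each decoder must reconstruct its own teacher's features from that code. The 1x1 convolutions and channel sizes are illustrative assumptions, not the cited architecture:

```python
import torch
import torch.nn as nn

class FeatureAmalgamator(nn.Module):
    # Encodes concatenated teacher features into a shared code and
    # reconstructs each teacher's features from it; a student would then
    # be trained to match the shared code.
    def __init__(self, teacher_channels=(256, 256), code_channels=128):
        super().__init__()
        total = sum(teacher_channels)
        self.encoder = nn.Conv2d(total, code_channels, kernel_size=1)
        self.decoders = nn.ModuleList(
            nn.Conv2d(code_channels, c, kernel_size=1)
            for c in teacher_channels)

    def forward(self, teacher_feats):
        code = self.encoder(torch.cat(teacher_feats, dim=1))
        recons = [dec(code) for dec in self.decoders]
        return code, recons  # reconstruction loss drives the amalgamation
```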
“…Various techniques have also been proposed to improve the generalisability and performance of student models, including using noisy data (Li et al., 2017; Sarfraz et al., 2019), adaptive regularisation of distillation parameters (Ding et al., 2019) and adversarial perturbation of training data (Xie et al., 2020). Multi-task learning methods have also been shown to provide good regularisation, reducing the risk of over-fitting (Liu et al., 2019a; Ye et al., 2019). The auxiliary task could be a related task (e.g.…”
Section: Introduction
confidence: 99%
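For context, the baseline these improvements build on is standard soft-target distillation, optionally regularised by an auxiliary task as in the multi-task works cited above. A minimal sketch, where the temperature `T` and the weights `alpha`/`beta` are hypothetical choices:

```python
import torch
import torch.nn.functional as F

def kd_with_auxiliary(student_logits, teacher_logits,
                      aux_pred, aux_target, T=4.0, alpha=0.5, beta=0.1):
    # Soft-target distillation (Hinton et al.): match temperature-softened
    # teacher and student distributions; T*T rescales the gradients.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    # Auxiliary-task term acting as a multi-task regulariser,
    # e.g. a related regression task.
    aux = F.mse_loss(aux_pred, aux_target)
    return alpha * soft + beta * aux
```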