2023
DOI: 10.1109/tii.2022.3209672
Multilevel Attention-Based Sample Correlations for Knowledge Distillation

Cited by 59 publications (9 citation statements)
References 26 publications
“…In this section, we first compared CSKD with some previous methods, including KD [12], FitNet [34], AT [35], SP [36], VID [38], HKD [39], MGD [41], CRD [40], virtual knowledge distillation (VKD) [42], curriculum expert selection for knowledge distillation (CESKD) [43], and multilevel attention-based sample correlations for knowledge distillation (MASCKD) [44], on benchmark datasets to verify its effectiveness. Then, we conducted representational transferability experiments to evaluate the quality of the representations learned by the student network.…”
Section: Methods
Mentioning confidence: 99%
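The comparison above is anchored on Hinton-style logit distillation (KD). As a minimal sketch of that baseline loss, the snippet below combines a temperature-softened KL term with cross-entropy; the function name kd_loss and the hyperparameters T and alpha are illustrative assumptions, not values from the cited work.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Minimal sketch of vanilla (Hinton-style) knowledge distillation.

    T and alpha are illustrative hyperparameters, not values from the
    cited paper.
    """
    # Soft targets: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```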
“…By setting the derivative of the objective function in (25) with respect to a^(k) to zero, the optimal solution of problem (25) is as follows:…”
Section: 2.4
Mentioning confidence: 99%
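The quoted statement stops short of the closed-form solution, and objective (25) of the cited paper is not reproduced in this excerpt. The LaTeX below therefore only illustrates the generic set-the-derivative-to-zero step, assuming a ridge-style quadratic objective in the coefficient vector a^(k); the symbols X^(k), y, and λ are placeholders, not notation from the source.

```latex
% Illustrative only: assuming a generic ridge-style quadratic objective
%   J(a^{(k)}) = \|X^{(k)} a^{(k)} - y\|_2^2 + \lambda \|a^{(k)}\|_2^2,
% setting the derivative to zero yields the closed-form optimum:
\frac{\partial J}{\partial a^{(k)}}
  = 2\,X^{(k)\top}\!\bigl(X^{(k)} a^{(k)} - y\bigr) + 2\lambda a^{(k)} = 0
\;\Longrightarrow\;
a^{(k)} = \bigl(X^{(k)\top} X^{(k)} + \lambda I\bigr)^{-1} X^{(k)\top} y
```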
“…A robust unsupervised spectral feature selection model [23] incorporated graph matrix construction and feature selection into a single data-mining process. In addition, exploiting sample correlations also improves learning performance [24][25][26]. The most classical such method is low-rank representation (LRR) [27,28], which captures the global structure and correlations of all training data.…”
Section: Introduction
Mentioning confidence: 99%
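For orientation, the formulation below is the standard LRR objective as commonly stated in the low-rank representation literature; it is a sketch only, since the exact variant used in [27,28] (for instance, the choice of error norm) is not given in this excerpt.

```latex
% Standard LRR objective (common form; the variant in [27,28] may differ):
\min_{Z,\,E}\; \|Z\|_* + \lambda \|E\|_{2,1}
\quad \text{s.t.} \quad X = XZ + E
% X: data matrix (columns are samples); Z: low-rank coefficient matrix
% whose entries encode sample-to-sample correlations; E: sparse error term;
% \|\cdot\|_* is the nuclear norm and \|\cdot\|_{2,1} the column-wise l_{2,1} norm.
```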
“…Since the imaging process is a many-to-one mapping, it introduces uncertainty and inaccuracy into the image itself. Although a growing number of data-driven methods can be used to analyse this mapping, they require large amounts of training data [6,7]. Here, we use fuzzy sets to characterize the uncertain information in the image [8].…”
Section: Introduction
Mentioning confidence: 99%
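The excerpt does not specify how [8] constructs its fuzzy sets, so the snippet below is only a minimal sketch of the general idea: grading each pixel's membership in a "bright" set so that intermediate grades express intensity uncertainty. The function name fuzzy_membership and the linear ramp are illustrative assumptions, not the cited method.

```python
import numpy as np

def fuzzy_membership(image, low=None, high=None):
    """Minimal sketch: map pixel intensities to fuzzy membership in [0, 1].

    Illustrative only; the specific fuzzy-set construction of the cited
    work [8] is not given in this excerpt. A simple linear ramp grades how
    strongly each pixel belongs to the 'bright' set, with intermediate
    grades expressing intensity uncertainty.
    """
    img = image.astype(np.float64)
    low = img.min() if low is None else low
    high = img.max() if high is None else high
    # Linear ramp: 0 below `low`, 1 above `high`, graded in between.
    mu = np.clip((img - low) / max(high - low, 1e-12), 0.0, 1.0)
    return mu
```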