2021 IEEE Winter Conference on Applications of Computer Vision Workshops (WACVW)
DOI: 10.1109/wacvw52041.2021.00019
Domain Adaptive Knowledge Distillation for Driving Scene Semantic Segmentation

Cited by 22 publications (10 citation statements)
References 23 publications

Citation statements, ordered by relevance:
“…where q_i^s is the prediction map of the current model M_t on the current input x_t for the previous task t − 1; q_i^t is the prediction map of the previous model M_{t−1} on the current input x_t for the previous task t − 1; ϕ is the KL-divergence (KLD) loss between these two softmax probability distribution maps, computed and summed over each previously learned task i, 0 < i < t; λ_KLD is the regularization hyperparameter for the KL divergence. KLD here effectively distills domain knowledge from the teacher q_i^t to the student q_i^s, and can be seen as domain adaptive knowledge distillation [22]. The total loss for the domain-invariant parameters W_s is hence given as:…”
Section: Domain-invariant Parameters
confidence: 99%
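To make the quoted regularizer concrete, here is a minimal PyTorch sketch of a KL-divergence distillation term summed over previously learned tasks. The function names, the list-of-prediction-heads interface, and the default value of λ_KLD are illustrative assumptions for this sketch, not the cited authors' implementation.

```python
import torch.nn.functional as F

def kld_distillation(student_logits, teacher_logits):
    # KL divergence between the softmax prediction maps (N, C, H, W)
    # of the current (student) and previous (teacher) model for one task.
    log_q_s = F.log_softmax(student_logits, dim=1)
    q_t = F.softmax(teacher_logits, dim=1)
    return F.kl_div(log_q_s, q_t, reduction="batchmean")

def domain_distillation_term(student_heads, teacher_heads, lambda_kld=0.1):
    # Sum the KLD term over every previously learned task i, 0 < i < t,
    # then scale by the regularization weight lambda_KLD.
    loss = sum(kld_distillation(q_s, q_t)
               for q_s, q_t in zip(student_heads, teacher_heads))
    return lambda_kld * loss
```

In a continual-learning loop, `student_heads` would hold the current model's prediction maps for the old tasks and `teacher_heads` the frozen previous model's maps on the same input; the scaled sum is then added to the supervised loss on the current task.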
“…In addition to the centroid-aware methods, there are also many straightforward solutions for domain adaptation, e.g., Knowledge Distillation [54], [60] and Mixing [59], [61]. These popular methods indeed boost performance to a record high, yet their training tends to be time-consuming and requires substantial computational resources, making unsupervised domain adaptation impractical in industrial scenarios.…”
Section: Discussion
confidence: 99%
“…For fine-grained feature alignment, these methods focus on reducing the distance between the corresponding classes of the two domains. In addition, there are also many effective solutions for domain adaptation, e.g., Knowledge Distillation [54], [60] and Mixing [59], [61]. These methods have led to considerable progress on the major open benchmark datasets.…”
Section: B. Unsupervised Domain Adaptation
confidence: 99%
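As a rough illustration of the class-wise alignment idea in the excerpt above (not any specific cited method), one can compute per-class feature centroids in the source and target domains and penalize the distance between corresponding centroids. The function name, the flattened feature layout, and the use of pseudo-labels on the target side are assumptions for this sketch.

```python
import torch

def class_centroid_alignment(src_feat, src_labels, tgt_feat, tgt_pseudo_labels, num_classes):
    # src_feat / tgt_feat: (N, D) per-pixel or per-region features;
    # src_labels / tgt_pseudo_labels: (N,) class indices.
    # Pull the centroids of corresponding classes together with a squared L2 penalty.
    loss, counted = src_feat.new_zeros(()), 0
    for c in range(num_classes):
        src_mask = src_labels == c
        tgt_mask = tgt_pseudo_labels == c
        if src_mask.any() and tgt_mask.any():
            src_centroid = src_feat[src_mask].mean(dim=0)
            tgt_centroid = tgt_feat[tgt_mask].mean(dim=0)
            loss = loss + (src_centroid - tgt_centroid).pow(2).sum()
            counted += 1
    return loss / max(counted, 1)
```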
“…Furthermore, taking advantage of the intrinsic spatial structure present in urban scene images (which they focus on), they propose to partition the images into non-overlapping grids, and the domain alignment is performed on the pixel-level features from the same spatial region using GRL [50]. Finally, the Domain Adaptive Knowledge Distillation model [88] consists of a multi-level strategy to effectively distill knowledge at different levels (feature space and output space) using a combination of KL divergence and MSE losses.…”
Section: Entropy Minimization of Target Predictions (TEM)
confidence: 99%
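The multi-level strategy summarized in the last excerpt can be sketched as a weighted sum of a feature-space MSE term and an output-space KL-divergence term. The weights, tensor shapes, and function name below are assumptions made for illustration, not the exact formulation of the model in [88].

```python
import torch.nn.functional as F

def multi_level_distillation(student_feat, teacher_feat,
                             student_logits, teacher_logits,
                             w_feat=1.0, w_out=1.0):
    # Feature-space distillation: MSE between intermediate feature maps
    # (assumes the student and teacher features have matching shapes).
    feat_loss = F.mse_loss(student_feat, teacher_feat)
    # Output-space distillation: KL divergence between softmax prediction maps.
    out_loss = F.kl_div(F.log_softmax(student_logits, dim=1),
                        F.softmax(teacher_logits, dim=1),
                        reduction="batchmean")
    return w_feat * feat_loss + w_out * out_loss
```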