2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00902
Continual Learning With Extended Kronecker-Factored Approximate Curvature

Cited by 34 publications (14 citation statements)
References 15 publications
“…Ritter et al. [27] updated a quadratic penalty term in the loss function for every task using a block-diagonal Kronecker-factored approximation of the Hessian matrix, thereby accounting for intra-layer parameter interactions. Lee et al. [28] pointed out that using this Hessian approximation as the curvature of the quadratic penalty is not effective for networks containing batch normalization layers. To resolve this issue, they proposed a Hessian approximation method that takes the effect of batch normalization layers into account.…”
Section: Regularizing Loss Function
Citation type: mentioning (confidence: 99%)
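The quadratic penalty with block-diagonal Kronecker-factored curvature that this excerpt describes can be sketched for a single fully connected layer as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names kfac_quadratic_penalty, kfac_factors, and lam are hypothetical, the Kronecker factors are assumed to have been estimated on the previous task, and the batch-normalization issue that Lee et al. address is ignored.

```python
# Minimal sketch of a block-diagonal Kronecker-factored quadratic penalty for
# one fully connected layer, in the spirit of Ritter et al. [27]. Not the
# paper's implementation; names and hyperparameters are illustrative.
import torch

def kfac_factors(acts, grads):
    """Estimate the Kronecker factors from a batch of layer inputs `acts`
    (N x in_dim) and backpropagated output gradients `grads` (N x out_dim)."""
    N = acts.shape[0]
    A = acts.T @ acts / N      # E[a a^T], shape (in_dim, in_dim)
    G = grads.T @ grads / N    # E[g g^T], shape (out_dim, out_dim)
    return A, G

def kfac_quadratic_penalty(W, W_star, A, G, lam=1.0):
    """Return (lam/2) * vec(dW)^T (A kron G) vec(dW) without materializing the
    Kronecker product, via the identity tr(dW^T G dW A)."""
    dW = W - W_star                             # deviation from previous-task weights
    return 0.5 * lam * torch.sum(dW * (G @ dW @ A))
```

Because the curvature factorizes per layer, the penalty is evaluated with two small matrix products per layer instead of one product with the full (in_dim * out_dim)-dimensional Hessian block, which is what makes the block-diagonal Kronecker-factored approximation practical.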
“…After that, several normalization layers devised for various computer vision tasks were proposed, each with its respective advantages (Ulyanov et al., 2016; Wu & He, 2018; Hoffer et al., 2017). Recently, the appropriate use of BN in more diverse settings, such as meta-learning (Bronskill et al., 2020), task-incremental learning (Lee et al., 2020), and online class-incremental learning (Anonymous, 2022), has been discussed.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
“…This is because sharp task-specific minima lead to over-specialization to a particular task and consequently to forgetting of all other tasks. Weight constraints such as EWC [38] or second-order optimization [42] have similar motivations. SAM estimates the worst-case nearby parameters during a first forward/backward pass, and then optimizes the loss w.r.t.…”
Section: A3 Novel Continual Training Procedures
Citation type: mentioning (confidence: 99%)
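The excerpt above summarizes the two-pass structure of sharpness-aware minimization (SAM). As an illustration only, not the cited implementation, a minimal PyTorch sketch of one such step might look like this; sam_step, rho, and base_optimizer are assumed names.

```python
# Hedged sketch of one SAM step as described in the excerpt: a first
# forward/backward pass finds the (approximate) worst-case nearby parameters,
# a second pass computes the gradient there, and the optimizer updates the
# original parameters with that gradient.
import torch

def sam_step(model, loss_fn, inputs, targets, base_optimizer, rho=0.05):
    # First pass: gradient at the current parameters.
    loss_fn(model(inputs), targets).backward()

    # Climb to the approximate worst-case point within an L2 ball of radius rho.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append((p, e))
    model.zero_grad()

    # Second pass: gradient at the perturbed parameters.
    loss_fn(model(inputs), targets).backward()

    # Undo the perturbation, then update with the second-pass gradient.
    with torch.no_grad():
        for p, e in perturbations:
            p.sub_(e)
    base_optimizer.step()
    model.zero_grad()
```

In a continual-learning setting, the appeal noted in the excerpt is that this update steers each task toward flatter minima, which is the same end that EWC-style weight constraints and second-order penalties pursue by explicitly anchoring parameters with a curvature estimate.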