2021
DOI: 10.48550/arxiv.2106.02624
Preprint

ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Abstract: Curvature in form of the Hessian or its generalized Gauss-Newton (GGN) approximation is valuable for algorithms that rely on a local model for the loss to train, compress, or explain deep networks. Existing methods based on implicit multiplication via automatic differentiation or Kronecker-factored block diagonal approximations do not consider noise in the mini-batch. We present ViViT, a curvature model that leverages the GGN's low-rank structure without further approximations. It allows for efficient computat…
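
The low-rank structure mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the paper's implementation, only a toy under stated assumptions: each sample n has a model Jacobian J_n of shape (C, D) and a positive semi-definite loss Hessian H_n of shape (C, C) with respect to the model output, and the mini-batch GGN is taken to be G = (1/N) Σ_n J_nᵀ H_n J_n. The function names `ggn_factor` and `ggn_gram_eigenvalues` are hypothetical.

```python
import numpy as np


def ggn_factor(jacobians, out_hessians):
    """Return V of shape (D, N*C) such that the GGN equals V @ V.T.

    Assumes (for illustration) per-sample Jacobians of shape (C, D) and
    PSD output Hessians of shape (C, C), averaged over N samples.
    """
    n = len(jacobians)
    cols = []
    for J, H in zip(jacobians, out_hessians):
        # Symmetric factorization H = S @ S.T via an eigendecomposition.
        w, Q = np.linalg.eigh(H)
        S = Q * np.sqrt(np.clip(w, 0.0, None))  # clip round-off negatives
        cols.append(J.T @ S)  # (D, C) block for this sample
    return np.concatenate(cols, axis=1) / np.sqrt(n)


def ggn_gram_eigenvalues(V):
    """Eigenvalues of the small (N*C, N*C) Gram matrix V.T @ V, descending.

    They coincide with the nonzero eigenvalues of the (D, D) matrix V @ V.T,
    so the D x D GGN never has to be formed when N*C is much smaller than D.
    """
    gram = V.T @ V
    return np.linalg.eigvalsh(gram)[::-1]
```

With N samples and C outputs, the Gram matrix has N·C rows regardless of the parameter count D, which is why working with the factor V can be much cheaper than working with the full GGN.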

Citations: cited by 1 publication (1 citation statement)
References: 7 publications
“…The intrinsic low rank structure of the (empirical) Fisher has been exploited in a number of setups by a number of papers including (Agarwal et al., 2019; Goldfarb et al., 2020; Immer et al., 2021; Dangel et al., 2021).…”
Section: H1 Generally Related Work
Citation type: mentioning
confidence: 99%