2022
DOI: 10.48550/arxiv.2201.08924
Preprint

Nearest Class-Center Simplification through Intermediate Layers

Abstract: Recent advances in theoretical Deep Learning have introduced geometric properties that occur during training, past the Interpolation Threshold, where the training error reaches zero. We inquire into the phenomenon coined Neural Collapse in the intermediate layers of the networks, and emphasize the inner workings of Nearest Class-Center Mismatch inside the deepnet. We further show that these processes occur both in vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification L…
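
As a rough illustration of the Nearest Class-Center (NCC) idea named in the abstract (not the paper's own code, and the paper's exact definition of the mismatch may differ), the sketch below forms per-class feature means at one intermediate layer, applies the NCC decision rule, and measures how often it disagrees with the network's final prediction. The arrays `feats`, `labels`, and `net_preds` are hypothetical inputs.

```python
# Minimal illustrative sketch of nearest class-center (NCC) mismatch at one layer.
import numpy as np

def ncc_mismatch(feats: np.ndarray, labels: np.ndarray, net_preds: np.ndarray) -> float:
    """Fraction of samples where the NCC rule disagrees with the network's prediction.

    feats:     (N, D) activations collected at one intermediate layer
    labels:    (N,)   integer class labels used to form the class centers
    net_preds: (N,)   the network's final predicted labels
    """
    classes = np.unique(labels)
    # Class centers: per-class means of the intermediate-layer features.
    centers = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # Assign each sample to its nearest center in Euclidean distance.
    dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=-1)
    ncc_preds = classes[dists.argmin(axis=1)]
    return float(np.mean(ncc_preds != net_preds))
```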

Cited by 1 publication (1 citation statement)
References 13 publications
“…The work [19] shows that NC also happens on test data drawn from the same distribution asymptotically, but less collapse for finite samples [29]. Other works [29,49] demonstrated that the variability collapse of features is actually happening progressively from shallow to deep layers, and [2] showed that test performance can be improved when enforcing variability collapse on features of intermediate layers. The works [78,79] showed that fixing the classifier as a simplex ETF improves test performance on imbalanced training data and long-tailed classification problems.…”
Section: Motivations and Contributions
Citation type: mentioning (confidence: 99%)
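
The citation statement mentions fixing the classifier as a simplex equiangular tight frame (ETF). Purely as an illustrative sketch under the standard definition (not code from the cited works), the hypothetical helper below builds a K-class simplex ETF whose rows could serve as a fixed, non-trainable last-layer weight matrix.

```python
# Illustrative sketch only: constructing a K-class simplex ETF.
import numpy as np

def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (num_classes, dim) matrix whose rows form a simplex ETF."""
    K = num_classes
    assert dim >= K, "for this simple construction, require dim >= num_classes"
    # K x K simplex: centered identity, scaled so every row has unit norm and
    # any two distinct rows have inner product -1 / (K - 1).
    M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)
    # Rotate into the feature space with a random orthonormal basis.
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((dim, K)))  # (dim, K), orthonormal columns
    return M @ U.T
```

In the fixed-classifier setting described by the cited works, a matrix of this form replaces the trainable final layer, and the features are trained to align with its rows.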