Neural collapse with unconstrained features
2020 · Preprint
DOI: 10.48550/arxiv.2011.11619

Cited by 6 publications (17 citation statements) · References 0 publications
“…We show that the solution of the variational problem is given by a simplex equiangular tight frame (ETF). This proves the neural collapse behavior for (2), which provides some justification for the observation of such behavior in deep learning.…”
Section: Introduction and Results (supporting)
confidence: 69%
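
For readers unfamiliar with the limiting geometry referenced in this statement, here is a minimal sketch (our own illustration, not code from the cited works) that constructs an n-class simplex equiangular tight frame in R^d with NumPy and checks its two defining properties: unit-norm vectors and identical pairwise inner products of -1/(n-1). The function name simplex_etf and the random orthogonal embedding are choices made only for this example.

import numpy as np

def simplex_etf(n, d, seed=0):
    """Return a d x n matrix whose columns form a simplex ETF (requires d >= n - 1)."""
    assert d >= n - 1
    # Columns of I - (1/n) 11^T span an (n-1)-dimensional subspace.
    M = np.eye(n) - np.ones((n, n)) / n
    # Rescale so every column has unit norm.
    M *= np.sqrt(n / (n - 1))
    # Embed into R^d with a random matrix with orthonormal columns
    # (assumption: any isometric embedding works; the ETF is defined up to rotation).
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, n)))
    return Q @ M  # d x n

U = simplex_etf(n=4, d=10)
G = U.T @ U
print(np.round(np.diag(G), 3))   # all ones: equal norms
print(np.round(G[0, 1:], 3))     # all -1/(n-1) = -0.333: equal pairwise angles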
“…If d ≥ n − 1, the global minimum of the problem corresponds to the case where {u_i}_{i=1}^n form a simplex equiangular tight frame and u_i = v_i for all i = 1, …, n. We remark that similar results have been proved for different loss functions: for a large-deviation-type loss function in [3] and for an L2-loss function in [2], both for models with unconstrained feature vectors (i.e., without neural network parametrization of the u_i's). To the best of our knowledge, Theorem 1 is the first justification for the cross-entropy loss, which is considered in the large-scale experiments in [3] and is also perhaps the most popular choice for classification problems.…”
Section: Introduction and Results (supporting)
confidence: 57%
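
The snippet below is an illustrative numerical check of this statement, not a proof, and it assumes a unit-norm constraint on both the feature vectors u_i and the classifier vectors v_i (the precise constraint in the cited theorem may differ): the aligned simplex-ETF configuration u_i = v_i attains a lower cross-entropy than a random unit-norm configuration.

import numpy as np

def cross_entropy(U, V):
    """Mean cross-entropy when example i (column u_i) carries label i; logits = V^T u_i."""
    logits = V.T @ U
    logits -= logits.max(axis=0)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=0))
    return -np.mean(np.diag(log_probs))

n = 4
# Aligned configuration: u_i = v_i = i-th column of a unit-norm simplex ETF (here d = n).
M = np.sqrt(n / (n - 1)) * (np.eye(n) - np.ones((n, n)) / n)
loss_etf = cross_entropy(M, M)

# Random unit-norm configuration for comparison.
rng = np.random.default_rng(0)
X = rng.standard_normal((n, n)); X /= np.linalg.norm(X, axis=0)
Y = rng.standard_normal((n, n)); Y /= np.linalg.norm(Y, axis=0)
loss_rand = cross_entropy(X, Y)

print(round(loss_etf, 3), round(loss_rand, 3), loss_etf < loss_rand)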
“…Moreover, the last layer's weights {w_k} are also aligned (i.e., equal up to a scalar factor) to the same simplex ETF, and as a result, classification turns out to be based on the nearest class center in feature space. This "neural collapse" (NC) behavior has led to many follow-up papers (Mixon et al., 2020; Lu & Steinerberger, 2022; Wojtowytsch et al., 2021; Fang et al., 2021; Zhu et al., 2021; Graf et al., 2021; Ergen & Pilanci, 2021; Zarka et al., 2021). Some of them include practical implications of the NC phenomenon, such as designing layers (multiplication by tight frames followed by soft-thresholding) that concentrate within-class features (Zarka et al., 2021) or fixing the last layer's weights to be a simplex ETF (Zhu et al., 2021).…”
Section: Background and Related Work (mentioning)
confidence: 99%
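
To see why collapsed, ETF-aligned last-layer weights imply nearest-class-center classification, here is a short sketch of our own (with hypothetical equal-norm class means standing in for the collapsed means, and biases ignored): when the classifier rows equal the equal-norm class means, argmax_k w_k^T f and the nearest-class-center rule give the same prediction.

import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 10
# Equal-norm class means (stand-ins for the collapsed, ETF-aligned means).
M = rng.standard_normal((n, d))
M /= np.linalg.norm(M, axis=1, keepdims=True)
W = M.copy()  # last-layer weights aligned (here: equal) to the class means, no bias

f = rng.standard_normal(d)                       # an arbitrary penultimate-layer feature
linear_pred = np.argmax(W @ f)                   # linear classifier decision
ncc_pred = np.argmin(np.linalg.norm(M - f, axis=1))  # nearest-class-center decision
print(linear_pred == ncc_pred)                   # True: the two rules coincide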
“…The empirical work in (Papyan et al., 2020) has been followed by papers that theoretically examined the emergence of collapse to simplex ETFs in simplified mathematical frameworks. Starting from (Mixon et al., 2020), most of these papers (e.g., Lu & Steinerberger, 2022; Wojtowytsch et al., 2021; Fang et al., 2021; Zhu et al., 2021) consider the "unconstrained features model" (UFM), where the features of the training data after the penultimate layer are treated as free optimization variables (disconnected from the samples). The rationale behind this model is that modern deep networks are extremely overparameterized and expressive, such that their feature mapping can be adapted to any training data (e.g., even to noise (Zhang et al., 2021)).…”
Section: Introduction (mentioning)
confidence: 99%
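
Below is a minimal sketch of the unconstrained features model described in this statement, under assumptions of our own choosing (a tiny synthetic problem, plain gradient descent, and an arbitrary weight decay of 5e-3), not the exact setup of any cited paper: the penultimate-layer features are treated as free optimization variables and trained jointly with a linear classifier under cross-entropy, after which the within-class feature variability has shrunk markedly, consistent with the collapse behavior these papers analyze.

import numpy as np

rng = np.random.default_rng(0)
K, n_per, d = 3, 10, 5                    # classes, examples per class, feature dim
N = K * n_per
labels = np.repeat(np.arange(K), n_per)
Y = np.eye(K)[labels].T                   # one-hot targets, shape (K, N)

H = rng.standard_normal((d, N))           # "unconstrained" features: one free column per example
W = 0.1 * rng.standard_normal((K, d))     # linear classifier
H0 = H.copy()
lr, wd = 0.05, 5e-3

def within_class_variability(F):
    return np.mean([F[:, labels == k].var(axis=1).sum() for k in range(K)])

for _ in range(20000):
    logits = W @ H
    logits -= logits.max(axis=0)               # numerical stability
    P = np.exp(logits); P /= P.sum(axis=0)     # softmax over classes
    G = (P - Y) / N                            # gradient of mean cross-entropy w.r.t. logits
    W -= lr * (G @ H.T + wd * W)               # joint gradient step with weight decay
    H -= lr * (W.T @ G + wd * H)

# Within-class variability should drop substantially relative to initialization.
print(within_class_variability(H0), "->", within_class_variability(H))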