2022
DOI: 10.48550/arxiv.2203.15702
Preprint

Contrasting the landscape of contrastive and non-contrastive learning

Abstract: Many recent advances in unsupervised feature learning are based on designing features that are invariant under semantic data augmentations. A common way to achieve this is contrastive learning, which uses positive and negative samples. Some recent works, however, have shown promising results for non-contrastive learning, which does not require negative samples. Yet non-contrastive losses have obvious "collapsed" minima, in which the encoder outputs a constant feature embedding, independent of the input. …
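The "collapsed" minimum described in the abstract is easy to make concrete. Below is a minimal NumPy sketch, not the paper's construction: the loss is a SimSiam-style negative cosine similarity between embeddings of two augmented views (an assumed stand-in for the non-contrastive losses the abstract refers to), and an encoder that ignores its input already attains the global minimum of -1.

```python
import numpy as np

def noncontrastive_loss(z1, z2):
    """Negative mean cosine similarity between paired embeddings
    (a common non-contrastive objective; its minimum value is -1)."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    return -np.mean(np.sum(z1 * z2, axis=1))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 32))                # a batch of inputs
aug1 = x + 0.1 * rng.normal(size=x.shape)   # two "semantic" augmentations of the same inputs
aug2 = x + 0.1 * rng.normal(size=x.shape)

# Collapsed encoder: ignores the input and emits the same vector for every sample.
const = rng.normal(size=16)
collapsed = lambda inp: np.tile(const, (inp.shape[0], 1))

# A non-collapsed encoder (random linear map) for comparison.
W = rng.normal(size=(32, 16))
linear = lambda inp: inp @ W

print(noncontrastive_loss(collapsed(aug1), collapsed(aug2)))  # -1.0: the collapsed global minimum
print(noncontrastive_loss(linear(aug1), linear(aug2)))        # above -1.0
```

Because the objective alone is already minimized by this trivial encoder, the question studied by the paper and by the citing works below is why, or whether, gradient-based training avoids this solution.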

Cited by 4 publications (6 citation statements) | References 11 publications

Citation statements (ordered by relevance):
“…Various theoretical studies have also investigated non-contrastive methods for self-supervised learning [71,61,18,5,66,49,57,34]. Garrido et al [18] establishes the duality between contrastive and non-contrastive methods.…”
Section: Related Work (citation type: mentioning, confidence: 99%)
“…The fact that a method like SimSiam does not collapse is studied in [29]. The loss landscape of SimSiam is also compared to SimCLR's in [26], which shows that it learns bad minima. In [31], the optimal solutions of the InfoNCE criterion are characterized, giving a better understanding of the embedding distributions.…”
Section: Related Work (citation type: mentioning, confidence: 99%)
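For contrast with the non-contrastive sketch above, an InfoNCE-style contrastive loss (the criterion mentioned in the statement above; the implementation here is a generic illustrative version, not the one analysed in [31] or [26]) does not admit the constant-embedding solution as a minimizer, because each sample must also be distinguished from the other, negative, samples in the batch.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Contrastive (InfoNCE-style) loss: each embedding in z1 must match its
    positive partner in z2 while rejecting all other batch samples (negatives)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / temperature      # (batch, batch) similarity matrix
    targets = torch.arange(z1.shape[0])   # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# A collapsed (constant) embedding is not optimal here: all logits are equal,
# so the loss sticks at log(batch_size) instead of approaching 0.
z_collapsed = torch.ones(8, 16)
print(info_nce(z_collapsed, z_collapsed))  # log(8) ≈ 2.08
```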
“…Especially the theoretical papers by Tian et al [82] and Wang et al [87]. Pokle et al [67] compared the landscapes of contrastive and non-contrastive learning and pointed out the existence of non-collapsed bad minima for non-contrastive learning without a prediction head.…”
Section: Comparison to Similar Studies (citation type: mentioning, confidence: 99%)
“…There are already some theoretical papers [82,87,67] that try to address similar questions. While none of these papers studied the training process of the prediction head, our results provide a completely different perspective: We explain why training the prediction head can encourage the network to learn diversified features and avoid dimensional collapses, even when the trivial collapsed optima still exist in the training objective, which is not covered by the prior works.…”
Section: Introduction (citation type: mentioning, confidence: 99%)
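Several of the statements above concern the role of the prediction head and stop-gradient in avoiding collapse. As a point of reference, here is an illustrative PyTorch toy of that SimSiam-style asymmetry (the dimensions, layer choices, and the name TinySimSiam are assumptions for exposition, not the construction analysed in the cited papers). Note that the collapsed solution still minimizes this objective; the cited analyses ask why training with the asymmetry tends to avoid it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySimSiam(nn.Module):
    """Illustrative non-contrastive setup: encoder f, prediction head h,
    and a stop-gradient on one branch (the asymmetry discussed above)."""
    def __init__(self, in_dim=32, feat_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU(),
                                     nn.Linear(feat_dim, feat_dim))
        self.predictor = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.ReLU(),
                                       nn.Linear(feat_dim, feat_dim))

    def loss(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)
        p1, p2 = self.predictor(z1), self.predictor(z2)
        # Negative cosine similarity; the z's are detached so gradients only flow
        # through the prediction branch (the stop-gradient operation).
        return -0.5 * (F.cosine_similarity(p1, z2.detach(), dim=-1).mean()
                       + F.cosine_similarity(p2, z1.detach(), dim=-1).mean())

model = TinySimSiam()
x = torch.randn(8, 32)
x1 = x + 0.1 * torch.randn_like(x)   # two "augmentations" of the same batch
x2 = x + 0.1 * torch.randn_like(x)
print(model.loss(x1, x2))
```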