2021
DOI: 10.48550/arxiv.2110.04947
Preprint
Towards Demystifying Representation Learning with Non-contrastive Self-supervision

Abstract: Non-contrastive methods of self-supervised learning (such as BYOL and SimSiam) learn representations by minimizing the distance between two views of the same image. These approaches have achieved remarkable performance in practice, but it is not well understood 1) why these methods do not collapse to the trivial solutions and 2) how the representation is learned. Tian et al. (2021) made an initial attempt at the first question and proposed DirectPred, which sets the predictor directly. In our work, we analyze a…
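To make the setup concrete, here is a minimal sketch of the two ingredients the abstract describes: a BYOL/SimSiam-style non-contrastive loss (negative cosine similarity with a stop-gradient on the target branch) and a DirectPred-style linear predictor set directly from the eigendecomposition of the projection correlation matrix. The function names, the regularization constant eps, and the use of a single-batch correlation (the paper uses a moving-average estimate) are simplifications for illustration, not the exact recipe from the paper.

```python
import torch
import torch.nn.functional as F

def noncontrastive_loss(p1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """Negative cosine similarity between the online prediction p1 and the
    stop-gradient target projection z2 (BYOL/SimSiam-style)."""
    p1 = F.normalize(p1, dim=-1)
    z2 = F.normalize(z2.detach(), dim=-1)  # stop-gradient on the target branch
    return -(p1 * z2).sum(dim=-1).mean()

def directpred_predictor(z: torch.Tensor, eps: float = 0.1) -> torch.Tensor:
    """Set the linear predictor directly from the correlation matrix of the
    projections instead of training it by gradient descent (DirectPred-style,
    simplified: no moving-average correlation estimate)."""
    corr = (z.T @ z) / z.shape[0]        # d x d correlation matrix
    s, U = torch.linalg.eigh(corr)       # eigendecomposition, eigenvalues ascending
    s = s.clamp(min=0.0)
    p = s.sqrt() + eps * s.max().sqrt()  # regularized square-root spectrum
    return U @ torch.diag(p) @ U.T       # symmetric predictor W_p

# Example: projections of two augmented views of the same batch of images.
z1, z2 = torch.randn(256, 64), torch.randn(256, 64)
W_p = directpred_predictor(z1)
loss = noncontrastive_loss(z1 @ W_p, z2)
```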

Cited by 2 publications (10 citation statements)
References 21 publications
“…In this section, we will clarify the differences between our results and some similar studies, in particular the theoretical papers by Tian et al. [82] and Wang et al. [87]. Pokle et al. [67] compared the loss landscapes of contrastive and non-contrastive learning and pointed out the existence of non-collapsed bad minima for non-contrastive learning without a prediction head.…”
Section: Comparison To Similar Studies
confidence: 99%
“…In the paper [82], experiments on the STL-10 dataset showed that the linear prediction head tends to converge to a symmetric matrix during training. The follow-up paper [87] established a theory under a symmetric prediction head (which is not trained but manually set at each iteration). However, for reasons similar to why eigenspace alignment cannot fully explain the effects of the prediction head, the symmetric prediction head given in [87] may not fully explain the trainable prediction head either.…”
Section: Can Eigenspace Alignment Explain the Effects Of Training The…
confidence: 99%
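As a rough illustration of the distinction this statement draws, the following sketch shows a linear prediction head that is overwritten with a symmetric matrix at each iteration instead of being updated by gradient descent. The correlation-based update rule and the name set_symmetric_predictor are assumptions for illustration; the concrete formula used in [87] is not reproduced here.

```python
import torch

@torch.no_grad()
def set_symmetric_predictor(W_p: torch.Tensor, z: torch.Tensor) -> None:
    """Overwrite the linear predictor W_p in place with a symmetric matrix
    computed from the current batch of projections z (shape: batch x dim).
    This mimics a head that is 'manually set at each iteration' rather than
    trained; the exact rule in [87] may differ."""
    corr = (z.T @ z) / z.shape[0]      # d x d correlation matrix
    W_p.copy_(0.5 * (corr + corr.T))   # symmetrize explicitly for numerical safety
```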