2022
DOI: 10.1007/978-3-031-19809-0_14
DisCo: Remedying Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning

Cited by 22 publications (12 citation statements)
References 17 publications
“…Importantly, RoB removes the regularisation terms that aim at preventing collapse from the loss, and uses identical-view predictions instead of cross-view predictions in the loss. [42], DisCo [14] or SimReg [30] have noticed that joint-embedding self-supervised learning methods such as SwAV [8], MoCo [11,21] or DINO [9] suffer from a drop in performance when applied on low-compute neural nets. These works have proposed to use Knowledge Distillation [23] to circumvent those difficulties.…”
Section: Related Work
confidence: 99%
“…CompRess [27] and SEED [13] use a memory queue like MoCo [21] to distill the knowledge of the teacher by minimizing the cross-entropy between the probability distributions of the teacher and student, obtained by comparing a sample to each point in the queue. DisCo [14] and BINGO [42] make use of contrastive learning, with BINGO additionally grouping samples into clusters of related samples. Finally, SimReg [30] proposes regression as a generic way to transfer feature representations from a teacher to a student.…”
Section: Related Work
confidence: 99%
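The queue-based distillation objective described in the statement above can be sketched as follows. This is a minimal NumPy illustration, not the actual implementation from CompRess or SEED; the function name, queue size, and temperature value are assumptions for the example. It computes each embedding's similarity distribution over a shared memory queue and takes the cross-entropy of the student's distribution against the teacher's soft targets:

```python
import numpy as np

def queue_distillation_loss(student_feat, teacher_feat, queue, tau=0.07):
    """Cross-entropy between teacher and student similarity distributions
    over a shared memory queue (a SEED/CompRess-style sketch).

    student_feat, teacher_feat: L2-normalized embeddings, shape (d,)
    queue: L2-normalized memory bank of past embeddings, shape (K, d)
    """
    # Temperature-scaled cosine similarities to every queue entry.
    s_sim = queue @ student_feat / tau
    t_sim = queue @ teacher_feat / tau

    # Teacher similarities -> soft target distribution over the queue.
    t_exp = np.exp(t_sim - t_sim.max())
    p_teacher = t_exp / t_exp.sum()

    # Student log-softmax over the queue (numerically stable logsumexp).
    log_p_student = s_sim - (s_sim.max()
                             + np.log(np.exp(s_sim - s_sim.max()).sum()))

    # Cross-entropy H(p_teacher, p_student); minimized when the student's
    # distribution over the queue matches the teacher's.
    return float(-(p_teacher * log_p_student).sum())
```

Because the loss is a cross-entropy against the teacher's distribution, it is minimized exactly when the student reproduces the teacher's relative similarities to the queue entries, which is what lets a small student inherit the structure of a large teacher's embedding space.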