2021
DOI: 10.48550/arxiv.2106.11230
Preprint

Can contrastive learning avoid shortcut solutions?

Abstract: The generalization of representations learned via contrastive learning depends crucially on what features of the data are extracted. However, we observe that the contrastive loss does not always sufficiently guide which features are extracted, a behavior that can negatively impact the performance on downstream tasks via "shortcuts", i.e., by inadvertently suppressing important predictive features. We find that feature extraction is influenced by the difficulty of the so-called instance discrimination task (i.e…

Cited by 5 publications (10 citation statements)
References 11 publications
“…It provides a principled way to find more discriminative negatives for effective training of the encoder. Along this line, implicit feature modification (IFM) [17] is proposed to iteratively update the hard samples. Like AdCo, IFM pushes the samples towards the representations of the anchors, but its step size is set by a given budget.…”
Section: Related Work
confidence: 99%
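The IFM idea quoted above — pushing each negative a budgeted step towards the anchor to make it harder — can be illustrated with a minimal numpy sketch. This is an assumption-laden paraphrase, not the published method: the actual IFM perturbs features implicitly inside the loss, and the function name, the explicit geometric step, and the budget value here are all illustrative.

```python
import numpy as np

def ifm_style_perturb(anchor, negatives, eps=0.1):
    """Illustrative hard-negative hardening (not the exact IFM of [17]):
    move each negative embedding a step of length at most `eps`
    towards the anchor, making instance discrimination harder."""
    direction = anchor - negatives                      # (k, d) steps toward anchor
    norms = np.linalg.norm(direction, axis=1, keepdims=True)
    step = eps * direction / np.clip(norms, 1e-12, None)  # unit direction * budget
    return negatives + step

# usage: every perturbed negative ends up closer to the anchor
anchor = np.array([1.0, 0.0])
negatives = np.array([[0.0, 1.0], [-1.0, 0.0]])
hard_negatives = ifm_style_perturb(anchor, negatives, eps=0.1)
```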
“…However, neither of these approaches is considered in the image retrieval setup. [50] propose adversarial training in the latent space with the goal of improving standard generalization. Closer to image attribution, Panum et al. [47] combine deep metric learning algorithms with adversarial training, but they perform only small-scale experiments and evaluate their models via nearest-neighbour classification, which differs from retrieval under editorial and non-editorial distortions as performed in the context of image attribution.…”
Section: Related Work
confidence: 99%
“…How well self-supervised representations generalize to different settings, after training, is often assessed using a downstream evaluation task, such as object classification [6] or speaker identification [40]. Shortcut feature representations: Robinson et al. [49] show that the features learned by the InfoNCE [40] loss depend on the difficulty of instance discrimination during training. If the instance discrimination task is easy to solve during training, the model will learn shortcut features.…”
Section: Generalizability Of Contrastive Losses
confidence: 99%
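The InfoNCE loss referenced in the statement above can be written, for a single anchor, as a short numpy sketch. The function name and the temperature value are assumptions for illustration; this is the standard form of the loss, not code from any of the cited papers.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE for one anchor: cross-entropy that treats the positive
    as the correct class among (positive + negatives), using
    temperature-scaled cosine similarities as logits."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    logits = np.array(sims) / temperature
    # -log softmax probability of the positive
    return -logits[0] + np.log(np.sum(np.exp(logits)))

# usage: the loss is lower when the positive is well aligned with the anchor
anchor = np.array([1.0, 0.0])
loss_aligned = info_nce(anchor, np.array([1.0, 0.0]), [np.array([0.0, 1.0])])
loss_orthogonal = info_nce(anchor, np.array([0.0, 1.0]), [np.array([0.0, 1.0])])
```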
“…How well an ICR method generalizes beyond the specific training setup depends on the features that the method has learned during training. Contrastive loss functions are prone to learning shortcuts [30,49], which are rules that perform well on standard benchmarks but fail to generalize to other testing conditions [13]. In the context of ICR, a shortcut is a latent representation of either the image or the caption that does not contain all the aspects mentioned in a caption that describes that scene.…”
Section: Introduction
confidence: 99%