2021
DOI: 10.48550/arxiv.2106.04990
Preprint

It Takes Two to Tango: Mixup for Deep Metric Learning

Abstract: Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data augmentation methods for classification consider two or more examples at a time. The combination of the two ideas is…
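As a rough illustration of the combination the abstract describes, below is a minimal sketch of mixup used together with a pairwise metric-learning loss, assuming a PyTorch-style setup. The names (`mixup_batch`, `contrastive_loss`, `embed_net`) are hypothetical, and this is not the authors' exact formulation.

```python
# Minimal sketch: mixup combined with a pairwise metric-learning loss.
# Illustrative only; not the exact method of the paper above.
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.2):
    """Interpolate a batch with a shuffled copy of itself (standard mixup)."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    return x_mix, y, y[perm], lam

def contrastive_loss(z, y, margin=0.5):
    """Pairwise contrastive loss on L2-normalised embeddings z with labels y."""
    z = F.normalize(z, dim=1)
    dist = torch.cdist(z, z)                                  # pairwise distances
    same = (y.unsqueeze(0) == y.unsqueeze(1)).float()         # same-class mask
    pos = same * dist.pow(2)                                  # pull similar classes together
    neg = (1.0 - same) * F.relu(margin - dist).pow(2)         # push dissimilar classes apart
    mask = 1.0 - torch.eye(len(y), device=z.device)           # ignore self-pairs
    return ((pos + neg) * mask).sum() / mask.sum()

def mixup_metric_step(embed_net, x, y, alpha=0.2):
    """One training step: mix the inputs, embed them, and mix the loss targets.
    Mixing the two losses is one assumed way of defining mixed targets."""
    x_mix, y_a, y_b, lam = mixup_batch(x, y, alpha)
    z = embed_net(x_mix)
    return lam * contrastive_loss(z, y_a) + (1.0 - lam) * contrastive_loss(z, y_b)
```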

Cited by 5 publications (10 citation statements)
References 40 publications

“…Consequently, we introduce metric-learning regularization terms in the original problem (equation (6)), which we call FP-Metric. Metric learning (Hoffer and Ailon 2015; Kaya and Bilge 2019) is a well-known approach to learning appropriate representations via FP and positive samples in the computer vision (Karpusha, Yun, and Fehervari 2020; Venkataramanan et al. 2021) and audio (Chung et al. 2020; Xu et al. 2020) domains. Metric learning has also been attracting great attention recently due to its high performance in self-supervised and unsupervised approaches (Jaiswal et al. 2021).…”
Section: Theoretical Analysis
confidence: 99%
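The passage above adds metric-learning terms as a regularizer on top of an existing objective. A minimal sketch of that general pattern, using a triplet margin term in the spirit of Hoffer and Ailon (2015), is given below; the anchor/positive/negative selection and the weight `beta` are assumptions, not details taken from the cited work.

```python
# Hedged sketch: a metric-learning (triplet) term used as a regularizer.
import torch
import torch.nn.functional as F

def triplet_regularizer(anchor, positive, negative, margin=0.2):
    """Triplet margin term on embedding vectors (Hoffer & Ailon style)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def total_loss(task_loss, anchor, positive, negative, beta=0.1):
    """Original objective plus a weighted metric-learning regularization term."""
    return task_loss + beta * triplet_regularizer(anchor, positive, negative)
```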
“…Such a large batch is used for the first time in several of the employed image retrieval benchmarks. The batch size is virtually increased by another contribution, a computationally efficient mixup technique called SiMix, which operates on the similarity scores instead of the embedding vectors as in prior work [55,13,12]. If the training set is not large enough and all of its classes are used to form a single batch, SiMix proves essential for virtually increasing the batch size and significantly boosting performance.…”
Section: Query Ranked Database Images
confidence: 99%
“…Linearly interpolating labels entails the risk of generating false negatives if the interpolation factor is close to 0 or 1. Such limitations are overcome in the work of Venkataramanan et al. [55], which generalizes mixing examples from different classes for pairwise loss functions. The proposed SiMix approach differs from the aforementioned techniques as it operates on the similarity scores instead of the embedding vectors, making it computationally efficient.…”
Section: Related Work
confidence: 99%
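To make the contrast drawn in the quotes above concrete, the snippet below sketches mixing at the embedding level versus mixing at the similarity-score level, assuming cosine similarity on L2-normalised embeddings. It is a hypothetical illustration of the distinction only, not the actual SiMix algorithm from the cited work.

```python
# Illustration only: embedding-level vs. similarity-level mixing.
import torch
import torch.nn.functional as F

def embedding_level_mix(z_a, z_b, queries, lam=0.7):
    """Prior-work style: interpolate embeddings, re-normalise, then score."""
    z_mix = F.normalize(lam * z_a + (1.0 - lam) * z_b, dim=1)
    return queries @ z_mix.T          # similarity of queries to mixed embeddings

def similarity_level_mix(z_a, z_b, queries, lam=0.7):
    """SiMix-style idea: score first, then interpolate the similarity scores,
    avoiding the construction of extra mixed embedding vectors."""
    return lam * (queries @ z_a.T) + (1.0 - lam) * (queries @ z_b.T)
```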
“…Note that architectures, optimizers, and pooling layers differ slightly between the different implementations. For SOP and Landmarks, we use ResNet-18, GeM pooling, and SGD; [38]-R-GeM and [40] use ResNet-101, GeM pooling, and Adam; [33] uses BN-Inception, a linear projection, and RMSProp; [34] uses GoogLeNet, a linear projection, and SGD; and [47] uses a ResNet-50, a combination of average and max pooling followed by a linear projection, and AdamW. For TinyImageNet, we use ResNet-32 and SGD, and [36] uses ResNet-32 and Adam.…”
Section: D1 Switching the Task And The Loss
confidence: 99%