Classification is a Strong Baseline for Deep Metric Learning

Zhai, Andrew; Wu, Haoyu

doi:10.48550/arxiv.1811.12649

Cited by 46 publications

(55 citation statements)

References 0 publications


“…In most cases, the training process consists of multiplying weight matrix with embedding vectors to obtain logits, and then applying a certain loss function to the logits. The most straightforward one is the normalized softmax loss [90,141,165]. It is identical with the cross entropy loss with L2-normalized columns of the weight matrix.…”

Section: Supervision For Metric Learning A) Full Supervision (mentioning)

confidence: 99%

Bridging Gap between Image Pixels and Semantics via Supervision: A Survey

Duan; Kuo

2021

Preprint


The fact that there exists a gap between low-level features and semantic meanings of images, called the semantic gap, is known for decades. Resolution of the semantic gap is a long standing problem. The semantic gap problem is reviewed and a survey on recent efforts in bridging the gap is made in this work. Most importantly, we claim that the semantic gap is primarily bridged through supervised learning today. Experiences are drawn from two application domains to illustrate this point: 1) object detection and 2) metric learning for content-based image retrieval (CBIR). To begin with, this paper offers a historical retrospective on supervision, makes a gradual transition to the modern data-driven methodology and introduces commonly used datasets. Then, it summarizes various supervision methods to bridge the semantic gap in the context of object detection and metric learning.
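The normalized softmax loss described in the first citation statement above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The quoted text only states that the columns of the weight matrix are L2-normalized; normalizing the embedding as well and dividing the logits by a `temperature` are assumptions made here, and the function and parameter names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm (returned unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def normalized_softmax_loss(embedding, class_weights, label, temperature=0.05):
    """Cross entropy over logits built from L2-normalized vectors.

    embedding     -- one embedding vector
    class_weights -- list of per-class weight vectors (the columns of W)
    label         -- index of the ground-truth class
    temperature   -- assumed scaling factor, not taken from the quoted text
    """
    x = l2_normalize(embedding)  # assumption: the embedding is normalized too
    logits = []
    for w in class_weights:
        w_hat = l2_normalize(w)  # L2-normalized column of the weight matrix
        logits.append(sum(a * b for a, b in zip(x, w_hat)) / temperature)
    # Numerically stable log-sum-exp, then standard cross entropy.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


if __name__ == "__main__":
    weights = [[2.0, 0.0], [0.0, 3.0]]  # two classes, 2-D embeddings
    print(normalized_softmax_loss([1.0, 0.0], weights, label=0))
```

Because every vector is normalized before the dot product, each logit is a cosine similarity, so rescaling a class weight vector or the embedding leaves the loss unchanged.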

“…The most widely used classification loss function, softmax loss, has been revalued as a competitive objective function in metric learning [48,2]. The softmax loss is used to optimize the network f and class weight W :…”

Section: Preliminary (mentioning)

confidence: 99%

Learning with Memory-based Virtual Classes for Deep Metric Learning

Ko; Gu; Kim

2021

Preprint


The core of deep metric learning (DML) involves learning visual similarities in high-dimensional embedding space. One of the main challenges is to generalize from seen classes of training data to unseen classes of test data. Recent works have focused on exploiting past embeddings to increase the number of instances for the seen classes. Such methods achieve performance improvement via augmentation, while the strong focus on seen classes still remains. This can be undesirable for DML, where training and test data exhibit entirely different classes. In this work, we present a novel training strategy for DML called MemVir. Unlike previous works, MemVir memorizes both embedding features and class weights to utilize them as additional virtual classes. The exploitation of virtual classes not only utilizes augmented information for training but also alleviates a strong focus on seen classes for better generalization. Moreover, we embed the idea of curriculum learning by slowly adding virtual classes for a gradual increase in learning difficulty, which improves the learning stability as well as the final performance. MemVir can be easily applied to many existing loss functions without any modification. Extensive experimental results on famous benchmarks demonstrate the superiority of MemVir over state-of-the-art competitors. Code of MemVir will be publicly available.


“…Recall@K (%) For CUB200 and CARS196, cropped images with bounding box information are used. We follow the same training and test split as [9,24,60] for fair comparisons.…”

Section: Trick (mentioning)

confidence: 99%

Combination of Multiple Global Descriptors for Image Retrieval

Jun; Ko; Kim et al.

2019

Preprint


Recent studies in image retrieval task have shown that ensembling different models and combining multiple global descriptors lead to performance improvement. However, training different models for the ensemble is not only difficult but also inefficient with respect to time and memory. In this paper, we propose a novel framework that exploits multiple global descriptors to get an ensemble effect while it can be trained in an end-to-end manner. The proposed framework is flexible and expandable by the global descriptor, CNN backbone, loss, and dataset. Moreover, we investigate the effectiveness of combining multiple global descriptors with quantitative and qualitative analysis. Our extensive experiments show that the combined descriptor outperforms a single global descriptor, as it can utilize different types of feature properties. In the benchmark evaluation, the proposed framework achieves the state-of-the-art performance on the CARS196, CUB200-2011, In-shop Clothes, and Stanford Online Products on image retrieval tasks. Our model implementations and pretrained models are publicly available.
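The Recall@K (%) metric named in the citation statement above is commonly computed as the share of queries that have at least one same-class item among their K nearest neighbors. The sketch below assumes that definition and assumes the neighbor lists are already ranked by similarity with the query itself excluded; the function name and input layout are hypothetical and not taken from the cited papers.

```python
def recall_at_k(query_labels, ranked_neighbor_labels, k):
    """Percentage of queries with a same-class item in their top-k neighbors.

    query_labels           -- class label of each query
    ranked_neighbor_labels -- for each query, the labels of its retrieved
                              neighbors, nearest first, query itself excluded
    k                      -- number of top neighbors to inspect
    """
    if not query_labels:
        raise ValueError("at least one query is required")
    hits = 0
    for label, neighbors in zip(query_labels, ranked_neighbor_labels):
        # A query is a hit if any of its k nearest neighbors shares its class.
        if label in neighbors[:k]:
            hits += 1
    return 100.0 * hits / len(query_labels)


if __name__ == "__main__":
    labels = [0, 1, 2]
    neighbors = [[0, 1, 2], [2, 1, 0], [0, 1, 1]]
    for k in (1, 2, 3):
        print(k, recall_at_k(labels, neighbors, k))
```

Under this definition Recall@K can only stay the same or grow as K increases, which is why benchmark tables usually report it at several values of K.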

