2020
DOI: 10.1109/tpami.2018.2848925

Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

Abstract: Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate the task of training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Furthe…
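The abstract describes the core recipe: one shared embedding layer is split into a small ensemble of learners, and the learners are trained as an online gradient-boosting sequence, each one seeing training pairs reweighted by how well the previous learners already handle them. The sketch below illustrates that idea; it is not the authors' code, and the group sizes, the contrastive-style pair loss, and the exact reweighting rule are simplifying assumptions.

```python
# Minimal sketch of the boosted-ensemble idea from the abstract (not the
# authors' code): a shared feature is projected into several sub-embeddings
# ("learners"), and each learner is trained on pairs reweighted by how well
# the ensemble of previous learners already handles them. Group sizes, the
# contrastive-style pair loss, and the reweighting rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingEnsemble(nn.Module):
    """One linear embedding layer split into K independent groups (learners)."""
    def __init__(self, feat_dim=512, group_dims=(96, 160, 256)):
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(feat_dim, d) for d in group_dims])

    def forward(self, features):
        # Each learner yields an L2-normalised sub-embedding of the same feature.
        return [F.normalize(head(features), dim=1) for head in self.heads]

def boosted_pair_loss(groups, labels, margin=0.5):
    """Online-boosting-style loss: learner k sees pair weights derived from
    the running ensemble of learners 1..k-1."""
    same = (labels[:, None] == labels[None, :]).float()   # 1 for positive pairs
    weights = torch.ones_like(same)                        # learner 1: uniform weights
    ensemble_sim = torch.zeros_like(same)
    total = torch.zeros((), device=labels.device)
    for k, z in enumerate(groups, start=1):
        sim = z @ z.t()                                    # cosine similarities of learner k
        pos = same * (1.0 - sim)                           # pull positives together
        neg = (1.0 - same) * F.relu(sim - margin)          # push negatives below the margin
        total = total + (weights * (pos + neg)).mean()
        # Running ensemble prediction; pairs the ensemble still gets wrong are
        # upweighted for the next learner (a stand-in for the negative-gradient
        # reweighting used in online gradient boosting).
        ensemble_sim = ensemble_sim + (sim.detach() - ensemble_sim) / k
        weights = same * (1.0 - ensemble_sim) + (1.0 - same) * F.relu(ensemble_sim - margin)
    return total
```

At test time the sub-embeddings are concatenated into a single descriptor (as in BIER); during training, the reweighting is what pushes the learners toward complementary, less correlated activations.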

Cited by 133 publications (96 citation statements)
References: 48 publications

“…network representation is powerful enough to approximate arbitrary continuous functions [19], in practice this often leads to poor local minima and overfitting. This is partially due to an inefficient usage of the embedding space [34,36] and an attempt to directly fit a single distance metric to all available data [38,28,29].…”
Section: Introduction (mentioning)
confidence: 99%
“…In summary, our method is only outperformed by attention-based ensembles like ABE [13]. As mentioned before, HDC, A-BIER [19], ABE-8 and Proxy-NCA [18] use more complex/multiple network architectures. These networks are applicable in combination with our loss function and will lead to improved retrieval results.…”
Section: Comparison To the State-of-the-art (mentioning)
confidence: 63%
“…Comparison with State-of-the-art: In order to highlight the significance of our decoupling idea for zero-shot retrieval, we compare DeML with some remarkable embed-[…] From these tables, one can observe that our baseline (U512) can only achieve the general performances, and the performances of traditional ideas of loss-designing and samples-mining are almost on par with each other, while, by explicitly intensifying both the discrimination and diversity within the deep metric, our DeML can significantly improve the performances over the baseline model and outperforms other state-of-the-art methods by a noteworthy margin, demonstrating the necessity of our explicit enhancement for discrimination and generalization via the decoupling idea. Moreover, different from the listed ensemble methods [43,24,25,13], DeML has clear objects of jointly mitigating the aforementioned issues that are vital to ZSIR, and thus can easily surpass them. Worthy of mention is that we find DeML(I=2,J=4) is enough for the Stanford Online Products dataset, since, as in Fig. 4, the second scale has been capable of localizing the discriminative regions.…”
Section: Results (mentioning)
confidence: 99%
“…Yuan et al. [43] employ multiple layers at different depths for hard-aware sample mining and then cascade the learned embeddings together. Opitz et al. [24,25] adopt online gradient boosting and optimize different learners with reweighted data. Kim et al. [13] try to increase feature diversity via a contrastive loss but ignore the importance of learning a discriminative metric in the ZSIR task.…”
Section: Discussion (mentioning)
confidence: 99%
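The passage above contrasts three ensemble strategies; the one most directly tied to BIER's theme of independent learners is the diversity term attributed to Kim et al. [13]. Below is a rough, hypothetical sketch of such a regulariser: it simply penalises pairs of learners whose sub-embeddings of the same images are too similar. The function name, the margin, and the exact form of the penalty are illustrative assumptions, not the loss used in that paper.

```python
# Illustrative stand-in for a diversity/contrastive term between ensemble
# learners: penalise pairs of learners whose sub-embeddings of the same images
# are too similar, encouraging complementary features. The function name,
# margin, and penalty form are assumptions, not the loss from Kim et al. [13].
import torch.nn.functional as F

def diversity_regularizer(groups, margin=0.2):
    """groups: list of (B, d) L2-normalised sub-embeddings of the same batch,
    assumed here to share the same dimensionality d."""
    reg = groups[0].new_zeros(())
    pairs = 0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            # Per-image similarity between two learners' views of the same input.
            sim = F.cosine_similarity(groups[i], groups[j], dim=1)
            reg = reg + F.relu(sim - margin).mean()   # only overly similar pairs are penalised
            pairs += 1
    return reg / max(pairs, 1)
```

Such a term would typically be added, with a small weight, to the metric-learning loss so that it discourages redundancy without overriding the discriminative objective.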