2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015
DOI: 10.1109/cvpr.2015.7298995

Fine-grained visual categorization via multi-stage metric learning

Abstract: Fine-grained visual categorization (FGVC) is to categorize objects into subordinate classes instead of basic classes. One major challenge in FGVC is the co-occurrence of two issues: 1) many subordinate classes are highly correlated and are difficult to distinguish, and 2) there exists the large intra-class variation (e.g., due to object pose). This paper proposes to explicitly address the above two issues via distance metric learning (DML). DML addresses the first issue by learning an embedding so that data po…

citations
Cited by 123 publications
(83 citation statements)
references
References 24 publications
“…To alleviate this issue, we develop a strategy that sets a sufficiently large number of centers for each class at the beginning and then applies L2,1 norm to obtain a compact set of centers. We demonstrate the proposed loss on the fine-grained visual categorization tasks, where capturing local clusters is essential for good performance [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…Reasoning about the similarity between images or data of different modalities is an inherent challenge in computer vision. Beyond its prevalence in fundamental problems such as image-sentence retrieval [41,38], cross-domain image-matching [32,16], attribution learning [4,33] and visual categorization [29], it also has an increasingly prominent role in computer vision problems in the fashion and retail domains like outfit style modeling [14], fashion item retrieval and recommendation [10,22] and automatic capsule wardrobe generation [15]. Metric learning (the task of learning a distance function between features based on supervised similar/dissimilar pairs) is a common approach [footnote 1: https://github.com/rxtan2/Learning-Similarity-Conditions] [Figure 1: We propose the SCE-Net model for learning multi-faceted similarity between images, such as compatibility of two fashion items.]…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides, some approaches (e.g. [28], [37]) try to learn more robust representations via distance metric learning. [38] unifies deep CNN features with spatially weighted Fisher vectors to capture important details and eliminate background disturbance.…”
Section: Representation Learning (mentioning)
confidence: 99%