2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00977

Weakly Supervised Fine-Grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning

Cited by 72 publications (33 citation statements)
References 27 publications

“…Using the same backbone, our method gains a clear margin on Aircraft (0.8%), Flowers (1.0%) and Pets (1.1%). We also achieve the SotA results compared to the others based on the attention mechanism [4,8,42,62], GCN [55,60,84], and low-rank discriminative bases [58]. Moreover, our approach gives superior performance over these existing methods (marked * in Table 1) without leveraging secondary datasets.…”
Section: Fine-grained Image Classification
Mentioning, confidence: 79%

“…For a higher number of regions, it fails to learn spatial contextual information resulting in lower accuracy. Likewise, DFG [58] also suffers from the same scalability problem in their graph-structure. Our model is scalable to any number of regions without increasing the model parameters since the graph node's parameters are shared.…”
Section: Fine-grained Image Classification
Mentioning, confidence: 99%

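The scalability point in the excerpt above comes down to parameter sharing: when one weight matrix serves every graph node, the number of region nodes can grow without changing the model size. Below is a minimal, generic GCN-style sketch of that idea in PyTorch; the layer, its dimensions, and the toy fully connected adjacency are illustrative assumptions, not the architecture of DF-GMM or of the citing model.

```python
# Minimal sketch of why a graph layer with shared node parameters scales to any
# number of regions: the same weight matrix is applied to every node, so the
# parameter count does not depend on how many region nodes the graph has.
# Generic GCN-style layer for illustration only, not the cited models' layer.
import torch
import torch.nn as nn


class SharedParamGraphLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)  # one weight matrix shared by all nodes

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_regions, dim), adj: (num_regions, num_regions)
        # Row-normalize the adjacency so aggregation averages over neighbours.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        aggregated = (adj / deg) @ node_feats
        return torch.relu(self.linear(aggregated))


layer = SharedParamGraphLayer(dim=256)
for num_regions in (4, 9, 16):                  # varying number of region nodes
    feats = torch.randn(num_regions, 256)
    adj = torch.ones(num_regions, num_regions)  # fully connected toy graph
    out = layer(feats, adj)
    n_params = sum(p.numel() for p in layer.parameters())
    print(num_regions, out.shape, n_params)     # parameter count stays constant
```

Running the loop prints the same parameter count for 4, 9, and 16 region nodes, which is the property the citing paper highlights.
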
“…We determine the merged local region L_o as the detected object localization. It is shown that the larger δ_h is, the more local regions are needed to cover the object, and the easier it is to obtain better localization performance. Under δ_h = 0.6, the mIoU performance of n = 3 is lower than that of n = 2, which is consistent with the retrieval (in Table 4) and classification performances.…”
Section: Localization
Mentioning, confidence: 99%

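The excerpt evaluates localization by merging selected local regions into a single region L_o and scoring it with mIoU against the ground truth. As a hedged illustration only, the sketch below merges region boxes by taking their bounding union and computes IoU for one image; the merge rule, the box format, and the `merge_regions`/`iou` helpers are assumptions for the example, not the cited paper's exact procedure (which selects regions via the threshold δ_h).

```python
# Hedged sketch of the kind of evaluation the excerpt describes: merge several
# local region boxes into one localization box (here simply their bounding
# union, an assumption for illustration) and score it with IoU against a
# ground-truth object box. Boxes are (x1, y1, x2, y2).
from typing import List, Tuple

Box = Tuple[float, float, float, float]


def merge_regions(regions: List[Box]) -> Box:
    # Bounding union of all selected local regions.
    x1 = min(r[0] for r in regions)
    y1 = min(r[1] for r in regions)
    x2 = max(r[2] for r in regions)
    y2 = max(r[3] for r in regions)
    return (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


regions = [(10, 10, 60, 60), (40, 30, 110, 90)]  # two selected local regions
gt_box = (5, 5, 115, 95)                         # ground-truth object box
print(iou(merge_regions(regions), gt_box))       # localization quality for one image
```
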
“…In real-world applications, it is infeasible to annotate a large amount of training data. Therefore, unsupervised representation learning has been widely studied in many computer vision tasks like image classification [1,13,14,20,38,26,48], image retrieval [21], and object detection [19,31,4,40]. Recently, self-supervised learning methods [2,15,18,30,36,41,54] were in favor for unsupervised pre-training tasks, where a contrastive loss was adopted to learn instance discriminative representations.…”
Section: Related Work
Mentioning, confidence: 99%
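The related-work excerpt refers to contrastive, instance-discriminative pre-training. For readers unfamiliar with that objective, here is a minimal InfoNCE-style sketch in PyTorch, assuming two augmented views of the same batch of images have already been encoded; the temperature, batch size, and embedding dimension are arbitrary illustrative choices, and this is a generic formulation rather than the loss of any one cited method.

```python
# Minimal sketch of an instance-discriminative contrastive (InfoNCE-style)
# objective over two augmented views of the same images. Generic formulation
# for illustration; hyperparameters are arbitrary.
import torch
import torch.nn.functional as F


def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    # z1, z2: (batch, dim) embeddings of two views of the same batch of images.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # cosine similarity of every view pair
    targets = torch.arange(z1.size(0))   # positives sit on the diagonal
    return F.cross_entropy(logits, targets)


z1 = torch.randn(8, 128)  # view-1 embeddings from an encoder (toy values here)
z2 = torch.randn(8, 128)  # view-2 embeddings of the same 8 images
print(info_nce(z1, z2))
```
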