2020
DOI: 10.48550/arxiv.2007.02080
Preprint

End-to-end Learning of a Fisher Vector Encoding for Part Features in Fine-grained Recognition

Dimitri Korsch,
Paul Bodesheim,
Joachim Denzler

Abstract: Part-based approaches for fine-grained recognition do not show the expected performance gain over global methods, although being able to explicitly focus on small details that are relevant for distinguishing highly similar classes. We assume that part-based methods suffer from a missing representation of local features, which is invariant to the order of parts and can handle a varying number of visible parts appropriately. The order of parts is artificial and often only given by ground-truth annotations, where…

Cited by 2 publications (4 citation statements)
References 32 publications
“…Recent top performing supervised learning methods have substantially shown that the most successful strategy to FGVC is obtained by identifying, either explicitly or implicitly, the object parts [14,30,52,92]. The central underlying assumption is that fine-grained information resides within the parts.…”
Section: CNN Attention Maps (mentioning)
confidence: 99%
“…Global methods use the input image as a whole and employ different strategies for pre-training [10], augmentation [31,76], or pooling [41,69,70,98] to exploit the parts. In contrast, localization-based approaches apply sophisticated detection techniques in order to determine the regions of the parts [17,29,30,84,87,93-96]. The two categories have shown comparable results.…”
Section: Fine Grained Visual Categorization (mentioning)
confidence: 99%