2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016
DOI: 10.1109/cvpr.2016.128

Picking Deep Filter Responses for Fine-Grained Image Recognition

Cited by 323 publications
(222 citation statements)
References 18 publications
“…[28], [37]) try to learn more robust representations via distance metric learning. [38] unifies deep CNN features with spatially weighted Fisher vectors to capture important details and eliminate background disturbance. [25] incorporates deep CNNs into a generic boosting framework to combine the strength of multiple weaker learners, which improves the classification accuracy of a single model and simplifies the network design.…”
Section: Representation Learning (mentioning)
confidence: 99%
“…In general, RLA tries to attend and amplify the key region for capturing detailed visual representation while avoiding background disturbance. On the other hand, PL usually first localizes discriminative parts via some sophisticated part selection mechanisms such as part attentions [9,32,39] and convolutional responses [33,38], and then extracts the visual representations of the selected parts by using multiple independent feature extractors. Fig.…”
Section: Introduction (mentioning)
confidence: 99%
“…This is because expert human annotations can be cumbersome to obtain and are often error-prone [35]. More recent research has therefore concentrated on realizing parts in an unsupervised fashion [7], [9], [22], [30], [34], [41], [49]. These approaches have been shown to yield performances on par or even exceeding those that relied on manual annotations, owing to their ability of mining discriminative parts that are otherwise missing or inaccurate in human labelled data.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this paper, we follow the same motivation as above [38], [49], [50] to address the unique challenges of fine-grained classification. We importantly differ in that we do not attempt to introduce any explicit network components for discriminate part discovery.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is not sufficient for common users to find very fine-grained and personalized photos. For example, tagging different bird species [76,77], flower types [78,79], car models [80,81], or even human sentiment [82] have attracted extensive attention. This task is very challenging as some fine-grained categories (e.g., eared grebe, and horned grebe) can only be recognized by domain experts.…”
Section: D) Photo App on iOS 10 (mentioning)
confidence: 99%