Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020
DOI: 10.24963/ijcai.2020/152
Multi-attention Meta Learning for Few-shot Fine-grained Image Recognition

Abstract: The goal of few-shot image recognition is to distinguish different categories with only one or a few training samples. Previous work on few-shot learning has focused mainly on general object images, and current solutions usually learn a global image representation from training tasks to adapt to novel tasks. However, fine-grained categories are distinguished by subtle, local parts, which global representations cannot capture effectively. This may hinder existing few-shot learning approaches from dea…

Cited by 80 publications (59 citation statements)
References 16 publications
“…Network architectures. Following previous works [9], we adopt the standard feature extraction network Conv4, which contains four convolutional modules, each consisting of a 3×3 convolutional layer followed by a batch normalization layer and a ReLU layer.…”
Section: Subtle Difference Mining
confidence: 99%
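The Conv4 backbone described in the quotation above can be sketched as follows. This is a minimal illustration in PyTorch, not the cited authors' code; the 64-channel width, the 2×2 max-pooling after each module, and the 84×84 input resolution are assumptions, since they are conventional in few-shot learning backbones but not stated in the quoted passage.

```python
import torch
import torch.nn as nn


def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # One Conv4 module as described: 3x3 convolution -> batch norm -> ReLU.
    # The trailing 2x2 max-pool is an assumed (but conventional) addition.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class Conv4(nn.Module):
    """Standard four-module feature extractor used in few-shot learning."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, hidden),        # RGB input -> hidden channels
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)


# Assumed 84x84 RGB inputs; four 2x2 pools reduce 84 -> 42 -> 21 -> 10 -> 5.
feats = Conv4()(torch.randn(2, 3, 84, 84))
print(tuple(feats.shape))  # (2, 64, 5, 5)
```

With these assumptions, each image is encoded as a 64×5×5 feature map, from which local (spatial) descriptors can be read off for the part-level comparisons that fine-grained few-shot methods rely on.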
“…Specifically, for the 5-way 1-shot task on the benchmarks, we improve by 17.81%, 14.37%, 29.22%, and 5.42% on average over Relation Net [1], DNN [3], Prototype Net [4], and MattML [9]. For the 5-way 5-shot task, we also gain improvements of 18.73%, 1.77%, 32.25%, and 2.89% on average, by a large margin, over Relation Net [1], DNN [3], Prototype Net [4], and MattML [9]. These results indicate that our approach enables the network to better extract the subtle discriminative features, which benefit the similarity measure of fine-grained images.…”
Section: Comparison With the State-of-the-art
confidence: 99%
“…LRPABN [9] proposes a pairwise bilinear pooling operator to perform feature alignment for learning an efficient distance metric. To capture more discriminative features, Zhu et al. [10] develop a multi-attention meta-learning method that learns informative parts via an attention mechanism. Similarly, BSNet [11] is proposed to improve generalization by decreasing the similarity bias and extracting more diverse features.…”
Section: Introduction
confidence: 99%