A collaborative gated attention network for fine-grained visual classification

Zhu, Qiangxi; Kuang, Wenlan; Li, Zhixin

doi:10.1016/j.displa.2023.102468

Cited by 11 publications

(3 citation statements)

References 40 publications

“…According to the experimental data in Table 4, on the FGVC-Aircraft dataset, our PGPL improves the classification accuracy by 0.4% and 1.1% compared with the collaborative gated attention network (CG) 1 and progressive multi-granularity (PMG) models, respectively. The CG model emphasizes the interrelationship among cross-layer features and uses channel and spatial attention modules to locate key part features to optimize model performance.…”

Section: Evaluation and Analysis on the FGVC-Aircraft Dataset (mentioning)

confidence: 99%

“…Fine-grained image classification (FGIC) aims to distinguish specific subcategories from the same superclass. 1 It plays a key role in many science and engineering fields, such as environmental protection, 2 intelligent transportation, 3 and medical image diagnosis. 4 Compared with general image classification, FGIC is more challenging due to the minor inter-class differences and the large intra-class variations.…”

Section: Introduction (mentioning)

confidence: 99%

Posture-guided part learning for fine-grained image categorization

Song, Chen. 2024. J. Electron. Imag.

The challenge in fine-grained image classification tasks lies in distinguishing subtle differences among fine-grained images. Existing image classification methods often only explore information in isolated regions without considering the relationships among these parts, resulting in incomplete information and a tendency to focus on individual parts. Posture information is hidden among these parts, so it plays a crucial role in differentiating among similar categories. Therefore, we propose a posture-guided part learning framework capable of extracting hidden posture information among regions. In this framework, the dual-branch feature enhancement module (DBFEM) highlights discriminative information related to fine-grained objects by extracting attention information between the feature space and channels. The part selection module selects multiple discriminative parts based on the attention information from DBFEM. Building upon this, the posture feature fusion module extracts semantic features from discriminative parts and constructs posture features among different parts based on these semantic features. Finally, by fusing part semantic features with posture features, a comprehensive representation of fine-grained object features is obtained, aiding in differentiating among similar categories. Extensive evaluations on three benchmark datasets demonstrate the competitiveness of the proposed framework compared with state-of-the-art methods.

“…The peak response region feature of the feature map is used to cluster the channels with similar response regions to obtain local regions with discrimination. At the same time, the channel grouping loss function is used to increase the inter-class differentiation and reduce the intra-class differentiation; Zhang et al [19] control the contribution of different regions to recognition through the gating mechanism; Zhu et al [20] proposed a simple and effective cross door attention learning strategy that guides the final classification through rich discriminative features in key regions, and achieved good results.…”

Section: Related Work (mentioning)

confidence: 99%

Fine-grained image classification network based on complementary learning

Jing, Yao, Min et al. 2023. Preprint.

Abstract: The objects in fine-grained image categories (e.g., bird species) are subclasses of a broader category. Because the differences between subclasses are very subtle and mostly concentrated in multiple local areas, the task of fine-grained image recognition is very challenging. At the same time, some fine-grained networks tend to focus on a single region when judging the target category, so they lack features from other auxiliary regions. To this end, Inception V3 is used as the backbone network, and an enhanced and complementary fine-grained image classification network is designed. While adopting the method of reinforcement learning to obtain more detailed fine-grained image features, the complementary network obtains the complementary discriminative area of the target through attention erasure, increasing the network's perception of the overall target. Finally, experiments are conducted on three open datasets: CUB-200-2011, FGVC-Aircraft, and Stanford Dogs. The experimental results show that the proposed model has better performance.