Fine-Grained Object Classification via Self-Supervised Pose Alignment

Yang, Xuhui; Wang, Yaowei; Chen, Ke; Xu, Yong

doi:10.1109/cvpr52688.2022.00725

Cited by 48 publications

(7 citation statements)

References 30 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…This is related to the fact that various attention regions affect the classification, limiting our ability to concentrate on the most critical information penetrating the channels. Nevertheless, we still achieve an improvement of about 1% over the average accuracy of the methods after 2020 (MC-Loss [31], PMG [28], DP-Net [30], and P2P-Net [32]), proving that our method has a performance advantage.…”

Section: Comparisons With State-of-the-art Methodsmentioning

confidence: 69%

“…For an accurate comparison, we directly refer to the accuracy of the previous methods without modifications. As an exception, for P2P- Net [32], we used the published code to conduct learning with the same parameters, enabling a more detailed comparison since it is closer to our method. Table 2 presents the results of the fine-grained image classification for the CUB, AIR, and CAR datasets.…”

Section: Comparisons With State-of-the-art Methodsmentioning

confidence: 99%

“…MGE-CNN [29] and DP-Net [30] learn the Kullback-Leibler divergence-based constraint to enhance model diversity and incorporate the position evidence into the visual content information before dynamically aligning it. Meanwhile, [31], [32] took one step further by implementing a channel-wise attention approach trained with a mutual-channel loss (MC-Loss) that yields local discriminative regions with an adaptive graph-matching algorithm to promote discriminative features via poseintensive feature regularization. Not only limited to that, they also utilize the scale pyramid of the feature map to achieve robustness in the scale of the image.…”

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Fine-Grained Classification via Hierarchical Feature Covariance Attention Module

et al. 2023

View full text Add to dashboard Cite

Fine-Grained Visual Classification (FGVC) has consistently been challenging in various domains, such as aviation and animal breeds. It is mainly due to the FGVC's criteria that differ with a considerably small range or subtle pattern differences. In the deep convolutional neural network, the covariance between feature maps positively affects the selection of features to learn discriminative regions automatically. In this study, we propose a method for a finegrained classification model by inserting an attention module that uses covariance characteristics. Specifically, we introduce a feature map attention module (FCA) to extract the feature map between convolution blocks, constituting the existing classification model. The FCA module then applies the corresponding value of the covariance matrix to the channel to focus on the salient area. We demonstrate the need for fine-grained classification in a hierarchical manner by focusing on the diverse scale representation. Additionally, we implemented two ablation studies to show how each suggested strategy affects classification performance. Our experiments are conducted on three datasets, CUB-200-2011, Stanford Cars, and FGVC-Aircraft, primarily used for fine-grained classification tasks. Our method outperforms the state-of-the-art models by a margin of 0.4%, 1.1%, and 1.4%.

show abstract

Section: Comparisons With State-of-the-art Methodsmentioning

confidence: 69%

Section: Comparisons With State-of-the-art Methodsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Fine-Grained Classification via Hierarchical Feature Covariance Attention Module

et al. 2023

View full text Add to dashboard Cite

show abstract

“…Currently, there are varied SSL implementations for solving fine‐grained image classification problem, for example, semantic learning from the discriminative feature‐representations of image parts (Yang et al., 2022; Yu et al., 2022), part‐level contrastive learning (Wang et al., 2022), attentively identifying fine‐grained images by interaction (Zhuang et al., 2020). However, this study shows the ability of local entropy‐mask segmentation in enhancing SSL performance to classify insect pests from complex images, as segmentation helps retain mostly the foreground portions that accentuate the learning of more meaningful representations during the pretext task, compared to the raw images.…”

Section: Resultsmentioning

confidence: 99%

Self‐supervised learning improves classification of agriculturally important insect pests in plants

Kar

Nagasubramanian

Elango

et al. 2023

The Plant Phenome Journal

View full text Add to dashboard Cite

Insect pests cause significant damage to food production, so early detection and efficient mitigation strategies are crucial. There is a continual shift toward machine learning (ML)‐based approaches for automating agricultural pest detection. Although supervised learning has achieved remarkable progress in this regard, it is impeded by the need for significant expert involvement in labeling the data used for model training. This makes real‐world applications tedious and oftentimes infeasible. Recently, self‐supervised learning (SSL) approaches have provided a viable alternative to training ML models with minimal annotations. Here, we present an SSL approach to classify 22 insect pests. The framework was assessed on raw and segmented field‐captured images using three different SSL methods, Nearest Neighbor Contrastive Learning of Visual Representations (NNCLR), Bootstrap Your Own Latent, and Barlow Twins. SSL pre‐training was done on ResNet‐18 and ResNet‐50 models using all three SSL methods on the original RGB images and foreground segmented images. The performance of SSL pre‐training methods was evaluated using linear probing of SSL representations and end‐to‐end fine‐tuning approaches. The SSL‐pre‐trained convolutional neural network models were able to perform annotation‐efficient classification. NNCLR was the best performing SSL method for both linear and full model fine‐tuning. With just 5% annotated images, transfer learning with ImageNet initialization obtained 74% accuracy, whereas NNCLR achieved an improved classification accuracy of 79% for end‐to‐end fine‐tuning. Models created using SSL pre‐training consistently performed better, especially under very low annotation, and were robust to object class imbalances. These approaches help overcome annotation bottlenecks and are resource efficient.

show abstract

“…Automatic puzzle reassembly has been widely studied (Paumard, Picard, and Tabia 2020;Bridger, Danon, and Tal 2020). The developed techniques have been used beyond puzzle reassembly, e.g., self-supervised learning of visual representations (Noroozi and Favaro 2016;Ma et al 2021;Yang et al 2022).…”

Section: Introductionmentioning

confidence: 99%

Siamese-Discriminant Deep Reinforcement Learning for Solving Jigsaw Puzzles with Large Eroded Gaps

Song

Jin

Yao

et al. 2023

AAAI

View full text Add to dashboard Cite

Jigsaw puzzle solving has recently become an emerging research area. The developed techniques have been widely used in applications beyond puzzle solving. This paper focuses on solving Jigsaw Puzzles with Large Eroded Gaps (JPwLEG). We formulate the puzzle reassembly as a combinatorial optimization problem and propose a Siamese-Discriminant Deep Reinforcement Learning (SD2RL) to solve it. A Deep Q-network (DQN) is designed to visually understand the puzzles, which consists of two sets of Siamese Discriminant Networks, one set to perceive the pairwise relations between vertical neighbors and another set for horizontal neighbors. The proposed DQN considers not only the evidence from the incumbent fragment but also the support from its four neighbors. The DQN is trained using replay experience with carefully designed rewards to guide the search for a sequence of fragment swaps to reach the correct puzzle solution. Two JPwLEG datasets are constructed to evaluate the proposed method, and the experimental results show that the proposed SD2RL significantly outperforms state-of-the-art methods.

show abstract

Fine-Grained Object Classification via Self-Supervised Pose Alignment

Cited by 48 publications

References 30 publications

Fine-Grained Classification via Hierarchical Feature Covariance Attention Module

Fine-Grained Classification via Hierarchical Feature Covariance Attention Module

Self‐supervised learning improves classification of agriculturally important insect pests in plants

Siamese-Discriminant Deep Reinforcement Learning for Solving Jigsaw Puzzles with Large Eroded Gaps

Contact Info

Product

Resources

About