Rotation Invariance Regularization for Remote Sensing Image Scene Classification with Convolutional Neural Networks

Qi, Kunlun; Yang, Chao; Hu, Chuli; Shen, You-Gen; Shen, Shengyu; Wu, Huayi

doi:10.3390/rs13040569

Cited by 29 publications

(18 citation statements)

References 59 publications


“…As seen, the proposed method surpasses most of the other methods. Our accuracy is similar to that of RIR [49] under 10% training samples and is better than RIR in the case of 20% training samples. Similar effects with the stochastic decision-level fusion training strategy are again observed.…”

supporting

confidence: 75%

Decision-Level Fusion with a Pluginable Importance Factor Generator for Remote Sensing Image Scene Classification

et al. 2021


Remote sensing image scene classification acts as an important task in remote sensing image applications, which benefits from the pleasing performance brought by deep convolution neural networks (CNNs). When applying deep models in this task, the challenges are, on one hand, that the targets with highly different scales may exist in the image simultaneously and the small targets could be lost in the deep feature maps of CNNs; and on the other hand, the remote sensing image data exhibits the properties of high inter-class similarity and high intra-class variance. Both factors could limit the performance of the deep models, which motivates us to develop an adaptive decision-level information fusion framework that can incorporate with any CNN backbones. Specifically, given a CNN backbone that predicts multiple classification scores based on the feature maps of different layers, we develop a pluginable importance factor generator that aims at predicting a factor for each score. The factors measure how confident the scores in different layers are with respect to the final output. Formally, the final score is obtained by a class-wise and weighted summation based on the scores and the corresponding factors. To reduce the co-adaptation effect among the scores of different layers, we propose a stochastic decision-level fusion training strategy that enables each classification score to randomly participate in the decision-level fusion. Experiments on four popular datasets including the UC Merced Land-Use dataset, the RSSCN 7 dataset, the AID dataset, and the NWPU-RESISC 45 dataset demonstrate the superiority of the proposed method over other state-of-the-art methods.
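The class-wise weighted summation and the stochastic participation of layer scores described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_scores`, the `keep_prob` parameter, and the choice of one factor per class per layer are all assumptions made for illustration.

```python
import random


def fuse_scores(scores, factors, keep_prob=1.0, rng=None):
    """Class-wise weighted sum of per-layer classification scores.

    scores:    list of per-layer score vectors, each of length num_classes
    factors:   list of per-layer importance vectors, same shape as scores
               (assumed layout: one factor per class per layer)
    keep_prob: probability that a layer takes part in the fusion
               (hypothetical knob standing in for the stochastic strategy)
    """
    rng = rng or random.Random()
    num_classes = len(scores[0])

    # Randomly decide which layers participate; always keep at least one.
    active = [i for i in range(len(scores)) if rng.random() < keep_prob]
    if not active:
        active = [rng.randrange(len(scores))]

    # Accumulate factor * score for every class over the active layers.
    fused = [0.0] * num_classes
    for i in active:
        for c in range(num_classes):
            fused[c] += factors[i][c] * scores[i][c]
    return fused
```

With `keep_prob=1.0` every layer contributes, which corresponds to plain weighted fusion at inference time; a value below 1 lets layers drop out at random during training.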


“…
VGG-VD16 [45]: 86.59, 89.64
DCNN [47]: 90.82, 96.89
Fusion by Addition [41]: -, 91.87
ACNet [37]: 93.33, 95.38
CNN-CapsNet [48]: 93.79, 96.32
SAL-TS-Net [40]: 94.09, 95.99
RIR + ResNet50 [49]: 94.95, 96.48
ResNet18 + LA (rotation) + KL (ours): 94.98, 96.52
…”

Section: Methods

mentioning

confidence: 99%

“…
VGG-VD16 [28]: 87.15, 90.36
DCNN [47]: 89.22, 91.89
ACNet [37]: 91.09, 92.42
CNN-CapsNet [48]: 89.03, 92.60
Siamese ResNet50 [50]: -, 92.28
SAL-TS-Net [40]: 85.02, 87.01
RIR + ResNet50 [49]: 92
As can be seen from the results of the AID dataset and the NWPU dataset, the classification accuracy has an improvement with the increase in the training ratio, indicating that the number of training sets has an important influence on the training model. The label augmentation proposed in this paper assigns a joint label to each new image obtained by the input transformation, i.e., rotation transformation.…”

Section: Methods

mentioning

confidence: 99%


Remote Sensing Image Scene Classification via Label Augmentation and Intra-Class Constraint

2021


In recent years, many convolutional neural network (CNN)-based methods have been proposed to address the scene classification tasks of remote sensing images. Since the number of training samples in RS datasets is generally small, data augmentation is often used to expand the training set. It is, however, not appropriate when original data augmentation methods keep the label and change the content of the image at the same time. In this study, label augmentation (LA) is presented to fully utilize the training set by assigning a joint label to each generated image, which considers the label and data augmentation at the same time. Moreover, the output of images obtained by different data augmentation is aggregated in the test process. However, the augmented samples increase the intra-class diversity of the training set, which is a challenge to complete the following classification process. To address the above issue and further improve classification accuracy, Kullback–Leibler divergence (KL) is used to constrain the output distribution of two training samples with the same scene category to generate a consistent output distribution. Extensive experiments were conducted on widely-used UCM, AID and NWPU datasets. The proposed method can surpass the other state-of-the-art methods in terms of classification accuracy. For example, on the challenging NWPU dataset, competitive overall accuracy (i.e., 91.05%) is obtained with a 10% training ratio.
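The two ingredients named in this abstract, a joint label per augmented image and a Kullback–Leibler constraint between output distributions, can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the four-rotation setting, the index layout `class_id * num_rotations + rotation_id`, and all function names are hypothetical.

```python
import math

NUM_ROTATIONS = 4  # assumed: rotations of 0, 90, 180 and 270 degrees


def joint_label(class_id, rotation_id, num_rotations=NUM_ROTATIONS):
    """Map a (scene class, rotation) pair to a single joint label index."""
    return class_id * num_rotations + rotation_id


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against log(0); it is an implementation convenience.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

In such a setup the KL term would be added to the classification loss for pairs of training samples that share a scene category, so that it shrinks toward zero as their output distributions agree.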


“…The daily increasing RS data boosts the growing demand for the intelligent extraction of valuable information for applications in various fields ranging from land use and land cover determination, to urban planning, environmental monitoring, and natural hazard detection [5], [6]. RS scene classification is an active research field in the RS community [7], [8] aiming at providing to each image a discrete land use category with semantic meaning. Generally, RS scenes contain rich information with complex spatial patterns, while commonly, the visual differences between the categories are small.…”

Section: Introduction

mentioning

confidence: 99%

Deep Object-Centric Pooling in Convolutional Neural Network for Remote Sensing Scene Classification

Yang

et al. 2021

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing

Self Cite


Remote sensing imagery typically comprises successive background contexts and complex objects. Global average pooling is a popular choice to connect the convolutional and fully connected (FC) layers for the deep convolution network. This paper equips the networks with another pooling strategy, namely the deep object-centric pooling (DOCP), to pool convolutional features considering the location of an object within the scene image. The proposed DOCP network structure consists of two steps: inferring object's location and separately pooling the foreground and background features to generate an object-level representation. Specifically, a spatial context module is presented to learn the location of the object of interest in the scene image. Then, the convolutional feature maps are pooled separately in the foreground and background of the object. Finally, the FC layer concatenates these pooled features and is followed by a batch normalization layer, a dropout layer, and a softmax layer. Two challenging data sets are employed to validate our approach. The experimental results demonstrate that the proposed DOCPnet can outperform the corresponding pooling methods and achieve better classification performance than other pre-trained convolutional neural network-based scene classification methods.
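The idea of pooling foreground and background features separately and then concatenating them can be sketched in a few lines. This is only an illustration: the abstract does not specify how the object location is represented, so the soft mask with weights in [0, 1], the masked-average pooling, and the names `masked_average` and `object_centric_pool` are assumptions.

```python
def masked_average(channel, weights):
    """Weighted average of one H x W channel under an H x W weight map."""
    total = sum(sum(row) for row in weights)
    if total == 0:
        return 0.0  # no pixels selected; fall back to zero
    acc = sum(v * w
              for frow, wrow in zip(channel, weights)
              for v, w in zip(frow, wrow))
    return acc / total


def object_centric_pool(feature_map, mask):
    """Pool foreground and background separately, then concatenate.

    feature_map: list of channels, each an H x W list of lists
    mask:        H x W foreground weights in [0, 1] (assumed to come from
                 some object-location module, not modeled here)
    Returns a vector of length 2 * num_channels: foreground then background.
    """
    background = [[1.0 - w for w in row] for row in mask]
    fg = [masked_average(ch, mask) for ch in feature_map]
    bg = [masked_average(ch, background) for ch in feature_map]
    return fg + bg
```

Compared with global average pooling, which would return one value per channel, this sketch returns two, so a downstream fully connected layer would see foreground and background statistics as distinct inputs.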
