Weighted Spatial Pyramid Matching Collaborative Representation for Remote-Sensing-Image Scene Classification

Liu, Baodi; Meng, Jie; Xie, Wen-Yang; Shao, Shuai; Li, Ye; Wang, Yanjiang

doi:10.3390/rs11050518

Cited by 56 publications

(38 citation statements)

References 30 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…The drawbacks of handcrafted features can be overcome by unsupervised learning-based features that have been studied by many researchers [30][31][32]. The input of unsupervised learning-based features is some handcrafted features, while the statistics of handcrafted features are output.…”

Section: R E T R a C T E Dmentioning

confidence: 99%

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Zhu

Yan

et al. 2019

Remote Sensing

View full text Add to dashboard Cite

Scene classification of highresolution remote sensing images (HRRSI) is one of the most important means of landcover classification. Deep learning techniques, especially the convolutional neural network (CNN) have been widely applied to the scene classification of HRRSI due to the advancement of graphic processing units (GPU). However, they tend to extract features from the whole images rather than discriminative regions. The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intraclass diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that constitutes three parts, namely attention maps generated by Gradientweighted Class Activation Mapping (GradCAM), a multiplicative fusion of deep features and the centerbased cross-entropy loss function. First of all, we propose to make attention maps generated by GradCAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from original images and attention maps are proposed to be fused by multiplicative fusion in order to consider both improved abilities to distinguish scenes of repeated texture and the salient regions. Finally, the centerbased cross-entropy loss function that utilizes both the cross-entropy loss and center loss function is proposed to backpropagate fused features so as to reduce the effect of intraclass diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. The experiments confirm that the proposed method outperforms most competitive scene classification methods with an average overall accuracy of 94% under different training ratios.

show abstract

Section: R E T R a C T E Dmentioning

confidence: 99%

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Zhu

Yan

et al. 2019

Remote Sensing

View full text Add to dashboard Cite

show abstract

“…In [22], three kinds of pre-trained CNN models were employed as feature extractors, and the features from the first fully connected layer were used for classification. In [44,46,[51][52][53]71], deep features from pre-trained CNNs were reprocessed to achieved excellent performance. In [73], a deep-learning-based classification method was presented to improve classification performance by combining pre-trained CNNs and extreme learning machine (ELM).…”

Section: Comparisons With the Most Recent Methodsmentioning

confidence: 99%

“…Published Time Classification Accuracy (%) MS-CLBP+FV [70] 2017 86.48 ± 0.27 GoogLeNet [22] 2017 86.39 ± 0.55 VGG-VD-16 [22] 2017 89.64 ± 0.36 CaffeNet [22] 2017 89.53 ± 0.31 DCA with concatenation [55] 2017 89.71 ± 0.33 Fusion by concatenation [55] 2017 91.86 ± 0.28 Fusion by addition [55] 2017 91.87 ± 0.36 Bidirectional adaptive feature fusion [72] 2017 93.56 salM 3 LBP-CLM [74] 2017 89.76 ± 0.45 TEX-Net-LF [56] 2017 92.96 ± 0.18 Converted CaffeNet [78] 2018 92.17 ± 0.31 Two-stage deep feature fusion [78] 2018 94.65 ± 0.33 Multilevel fusion [79] 2018 94.17 ± 0.32 ARCNet-VGG16 [4] 2019 93.10 ± 0.55 VGG19 + Hybrid-KCRC (RBF) [52] 2018 91.82 VGG-16-CapsNet [43] 2019 94.74 ± 0.17 VGG19 + SPM-CRC [51] 2019 92.55 VGG19 + WSPM-CRC [51] 2019 92.57 CTFCNN Ours 94.91 ± 0.24…”

Section: Methodsmentioning

confidence: 99%

Combing Triple-Part Features of Convolutional Neural Networks for Scene Classification in Remote Sensing

Hong

2019

Remote Sensing

View full text Add to dashboard Cite

High spatial resolution remote sensing (HSRRS) images contain complex geometrical structures and spatial patterns, and thus HSRRS scene classification has become a significant challenge in the remote sensing community. In recent years, convolutional neural network (CNN)-based methods have attracted tremendous attention and obtained excellent performance in scene classification. However, traditional CNN-based methods focus on processing original red-green-blue (RGB) image-based features or CNN-based single-layer features to achieve the scene representation, and ignore that texture images or each layer of CNNs contain discriminating information. To address the above-mentioned drawbacks, a CaffeNet-based method termed CTFCNN is proposed to effectively explore the discriminating ability of a pre-trained CNN in this paper. At first, the pretrained CNN model is employed as a feature extractor to obtain convolutional features from multiple layers, fully connected (FC) features, and local binary pattern (LBP)-based FC features. Then, a new improved bag-of-view-word (iBoVW) coding method is developed to represent the discriminating information from each convolutional layer. Finally, weighted concatenation is employed to combine different features for classification. Experiments on the UC-Merced dataset and Aerial Image Dataset (AID) demonstrate that the proposed CTFCNN method performs significantly better than some state-of-the-art methods, and the overall accuracy can reach 98.44% and 94.91%, respectively. This indicates that the proposed framework can provide a discriminating description for HSRRS images. 2 of 23 color features, spectral features, and multi-feature fusion [23,24]. However, these hand-crafted features are limited in describing complex scenes of HSRRS, which will affect the classification performance. Compared with the low-level methods, the mid-level methods aim to obtain a global representation of a scene by encoding local descriptors, e.g., scale-invariant feature transform, histogram of oriented gradient and color histogram. The bag-of-view-word (BoVW) model is one of the most popular feature encoding approaches [25]. Due to its simplicity and efficiency, the BoVW model is widely applied for mid-level scene description [26][27][28][29]. However, the quantization error of the BoVW method is large, and some important information may be lost. Therefore, many feature coding methods were developed to reduce the reconstruction error, including improved Fisher kernel (IFK) [30], vectors of locally aggregated descriptors (VLAD) [31], spatial pyramid matching kernel (SPM) [32], locality-constrained linear coding (LLC) [33], latent semantic analysis (LSA), probabilistic latent semantic analysis (pLSA) [34,35], and latent dirichlet allocation (LDA) [35]. However, both low-level and mid-level methods are mainly based on hand-crafted features, which are difficult to effectively describe in HSRRS scene images with complex land-cover/land-use (LULC) situations.In recent years, deep-learning-based methods have made a...

show abstract

“…First, the pixel values from 0 to 255 in R, G, and B are normalized to generate a value from 0 to 1. The formula is shown in (3).…”

Section: A Feature Extractionmentioning

confidence: 99%

“…Since the development of remote sensing technology, this technology has been applied to various fields, such as image identification [1], [2], [36], scene classification [3]- [5], and semantic segmentation [6]- [8]. Research on marine remote sensing has become increasingly important [9]- [14].…”

Section: Introductionmentioning

confidence: 99%

Semantic Segmentation of Marine Remote Sensing Based on a Cross Direction Attention Mechanism

Gao

Cao

et al. 2020

IEEE Access

View full text Add to dashboard Cite

With the development of remote sensing technology, the semantic segmentation and recognition of various things in the ocean have become more and more frequent. Due to the wide variety of marine things and the large differences in morphology, it has brought greater difficulties to the recognition of marine remote sensing images. In order to obtain better segmentation results of ocean remote sensing images, this paper proposes an cross attention mechanism(Horizontal and Vertical) of exponential operation combined with multi-scale convolution algorithm. Among them, the cross attention mechanism and expanded distribution weight coefficient mentioned in this paper are first proposed. First, Input the marine remote sensing image features into an cross attention mechanism algorithm of exponential operation to obtain feature weight coefficients and joint weight coefficients in multiple directions; Then, the features with weight coefficients are input into the multi-access convolutional layer and the multi-scale dilated convolutional layer respectively for deep feature mining; Then the above steps are repeated twice, and finally the semantic segmentation of marine remote sensing images is achieved by fusing multiple deeplevel features afterwards. Experiments were conducted on three public marine remote sensing data sets, and the results proved the effectiveness of our proposed cross attention mechanism of extended operation algorithm. The F values of the MAMC model on Beach, Island and Sea ice data sets have reached 99.4%, 91.25%, 87.08% respectively. Compared with other models, the effect is significantly improved, and proved the powerful performance of the algorithm in the semantic segmentation of marine remote sensing images.

show abstract

Weighted Spatial Pyramid Matching Collaborative Representation for Remote-Sensing-Image Scene Classification

Cited by 56 publications

References 30 publications

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

RETRACTED: Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Combing Triple-Part Features of Convolutional Neural Networks for Scene Classification in Remote Sensing

Semantic Segmentation of Marine Remote Sensing Based on a Cross Direction Attention Mechanism

Contact Info

Product

Resources

About