2020
DOI: 10.1109/jstars.2020.3021045
Diverse Capsules Network Combining Multiconvolutional Layers for Remote Sensing Image Scene Classification

Abstract: Remote sensing image scene classification has drawn significant attention for its potential applications in the economy and livelihoods. Unlike traditional handcrafted features, convolutional neural networks (CNNs) provide an excellent avenue for obtaining powerful discriminative features. Although tremendous efforts have been made in this domain, many open challenges remain in scene classification due to scene complexity, with high within-class diversity and between-class similari…

Cited by 18 publications (15 citation statements)
References 67 publications
“…Compared with AlexNet [53], VGG-VD16 [53], and Scenario (II) [10], which use only the original pre-trained CNN model, DOCP-net achieves higher classification performance by combining object-level and scene-level features. The proposed method obtains results comparable to D-CapsNet [54] and VGG-VD16+CapsNet [55]. According to Fig.…”
Section: E. Compared With Other Pre-trained CNN-based Methods
confidence: 72%
“…This model may have an advantage when the data set's scale increases. In [60], a self-attention-… In [18], a concentric circle pooling layer is proposed to incorporate rotation-invariant spatial layout information of remote sensing scene images.…”

[Table fused into the quote above — classification accuracy, mean (std) in %:]
Method                     Accuracy, mean (std)
[10]                       96.90 (0.77)
AlexNet+SPP [17]           95.95 (1.01)
CCP-net [18]               97.52 (0.97)
AlexNet+MSCP [56]          97.29 (0.63)
VGG-VD16+MSCP [56]         98.36 (0.58)
BAFF [57]                  95.48 (0.22)
RADC-Net [58]              97.05 (0.48)
D-CapsNet [54]             99.05 (0.12)
DDRL-AM [59]               99.05 (0.08)
VGG-VD16+CapsNet [55]      98.81 (0.12)
AlexNet+SAFF [60]          96.13 (0.97)
VGG-VD16+SAFF [60]         97.02 (0.78)

Section: E. Compared With Other Pre-trained CNN-based Methods
confidence: 99%
“…, N. We average these feature maps to obtain a teacher feature map F_e^{H×W} as in (10). The student is the main-branch feature map after normalization, F_m^{H×W}.…”
Section: Self-Distillation
confidence: 99%
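The self-distillation step quoted above — averaging N side-branch feature maps into a teacher map and comparing it against the normalized main-branch (student) map — can be sketched as follows. This is a minimal numpy illustration under assumed details (the citing paper's exact normalization and loss are not given here; zero-mean/unit-variance normalization and an MSE loss are common choices, used as placeholders):

```python
import numpy as np

def teacher_feature_map(branch_maps):
    """Average N feature maps F_1..F_N (each H x W) into a teacher map,
    as in the quoted averaging step (eq. (10) in the citing paper)."""
    return np.mean(np.stack(branch_maps, axis=0), axis=0)

def normalize(feature_map, eps=1e-8):
    """Placeholder normalization of the student (main-branch) map:
    zero mean, unit variance."""
    return (feature_map - feature_map.mean()) / (feature_map.std() + eps)

def distillation_loss(student, teacher):
    """One common choice of distillation loss: MSE between the
    student and teacher feature maps."""
    return float(np.mean((student - teacher) ** 2))

# Usage: three hypothetical 4x4 side-branch maps and one main-branch map.
rng = np.random.default_rng(0)
branches = [rng.standard_normal((4, 4)) for _ in range(3)]
teacher = teacher_feature_map(branches)          # shape (4, 4)
student = normalize(rng.standard_normal((4, 4)))  # shape (4, 4)
loss = distillation_loss(student, teacher)        # non-negative scalar
```

The teacher map keeps the same H×W shape as each branch map, so the student/teacher comparison is elementwise.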
“…The encoder contains different modules for feature extraction, including a dense module, a multiscale module, a feature fusion module, a channel-wise attention (CWA) [28] module, and a fused feature bank. To accept a visible or infrared image of arbitrary size, the image is first processed by a convolutional layer with a 3×3 kernel and a stride of 1.…”
Section: A. The Encoder
confidence: 99%
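The reason a 3×3, stride-1 convolution accepts images of arbitrary size is that, with zero padding of 1, the output spatial size equals the input size for any H×W. A minimal single-channel numpy sketch (the actual encoder uses learned multi-channel kernels; the explicit loop here is for clarity, not speed):

```python
import numpy as np

def conv3x3_same(image, kernel):
    """3x3 convolution, stride 1, zero padding 1.
    Output spatial size equals input size, so any H x W is accepted."""
    assert kernel.shape == (3, 3)
    H, W = image.shape
    padded = np.pad(image, 1)          # zero-pad one pixel on each side
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            # Elementwise product of the 3x3 window with the kernel.
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

# Usage: the same layer handles two differently sized inputs.
rng = np.random.default_rng(1)
k = rng.standard_normal((3, 3))
small = conv3x3_same(rng.standard_normal((5, 7)), k)   # shape (5, 7)
large = conv3x3_same(rng.standard_normal((12, 9)), k)  # shape (12, 9)
```

A kernel that is 1 at the center and 0 elsewhere reproduces the input exactly, which is a quick sanity check of the "same" padding arithmetic.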