2020
DOI: 10.3390/rs12040701
|View full text |Cite
|
Sign up to set email alerts
|

Multi-Scale Context Aggregation for Semantic Segmentation of Remote Sensing Images

Abstract: The semantic segmentation of remote sensing images (RSIs) is important in a variety of applications. Conventional encoder-decoder-based convolutional neural networks (CNNs) use cascade pooling operations to aggregate the semantic information, which results in a loss of localization accuracy and in the preservation of spatial details. To overcome these limitations, we introduce the use of the high-resolution network (HRNet) to produce high-resolution features without the decoding stage. Moreover, we enhance the… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
4
1

Citation Types

0
60
0

Year Published

2020
2020
2024
2024

Publication Types

Select...
7
3

Relationship

0
10

Authors

Journals

citations
Cited by 132 publications
(60 citation statements)
references
References 32 publications
0
60
0
Order By: Relevance
“…In recent years, networks such as AlexNet, VGG16, GoogleNet, and InceptionV3 have shown strong generalization capabilities in scene recognition. The use of ready-made pre-trained CNN models as a general feature extractor has become a method of remote sensing scene classification [19]. However, generating large parameters during the training process leads to higher requirements on hardware devices.…”
Section: Related Workmentioning
confidence: 99%
“…In recent years, networks such as AlexNet, VGG16, GoogleNet, and InceptionV3 have shown strong generalization capabilities in scene recognition. The use of ready-made pre-trained CNN models as a general feature extractor has become a method of remote sensing scene classification [19]. However, generating large parameters during the training process leads to higher requirements on hardware devices.…”
Section: Related Workmentioning
confidence: 99%
“…Inspired by natural image semantic segmentation solutions, many targeted approaches, based on encoder-decoder networks, are designed for remote-sensing images. To acquire the desired accurate segmentation and localization, Zhang, Lin, and Ding et al (2020) combined HRNet (High-resolution network) with ASP (Adaptive spatial pooling) to increase localization accuracy and preserve spatial details. The experimental results on the Vaihingen dataset (http://www2.isprs.org/commissions/comm3/wg4/2dsem-label-vaihingen.html) and Potsdam dataset (http://www2.isprs.org/commissions/ comm3/wg4/2d-sem-label-potsdam.html) reach SOTA.…”
Section: Introductionmentioning
confidence: 99%
“…Finally, the comprehensive evaluation of three sets of data sets highlights the superiority of this method. Zhang et al [18] introduced a high-resolution network (HRNet) to enhance features to obtain contextual semantic information. This method uses the spatial linking method of the model to show more semantics for the low-resolution information containing more semantic information to the high-resolution information, and enhance the high-resolution information, thereby solving the positioning caused by cascading pooling.…”
Section: Introductionmentioning
confidence: 99%