2022
DOI: 10.1109/jstars.2021.3139017

CGSANet: A Contour-Guided and Local Structure-Aware Encoder–Decoder Network for Accurate Building Extraction From Very High-Resolution Remote Sensing Imagery

Abstract: Extracting buildings accurately from very high-resolution (VHR) remote sensing imagery is challenging due to diverse building appearances, spectral variability, and complex background in VHR remote sensing images. Recent studies mainly adopt a variant of the Fully Convolutional Network (FCN) with an encoder-decoder architecture to extract buildings, which has shown promising improvement over conventional methods. However, FCN-based encoder-decoder models still fail to fully utilize the implicit characteristics …

citations
Cited by 28 publications
(12 citation statements)
references
References 56 publications
“…Post-processing based building mapping refines or vectorizes binary building masks using post-processing procedures to refine boundary or vectorize building masks. Since vectorization of building binary masks tends to bring blurred and irregular boundaries, many boundary refinement methods [14]- [17], [34] have been studied to regularize building boundaries by multi-scale feature fusion, building shape information embedding, or other post-processing procedures. BCTNet [35] proposes a bi-branch cross-fusion transformer network by using CNN and transformer to enhance multi-scale features from local and global aspects.…”
Section: A Post-processing Based Building Mapping (mentioning)
confidence: 99%
“…All these pixel-wise segmentation-based methods fail to obtain accurate building boundaries due to dense buildings and similar backgrounds in remote sensing images. To refine blurred boundaries, some studies [14]- [17] introduce boundary-preserved modules to regularize building boundaries. Although recent pixel-wise segmentation methods produce accurate buildings with precise boundaries, they usually output raster building segmentation masks, requiring a delicate post-vectorization pipeline to meet real-world geographic applications.…”
Section: Introduction (mentioning)
confidence: 99%
“…To address the challenge of the absence of detailed information across multiple scales at boundaries. Some researchers introduce auxiliary modules to refine the boundary information [22]. Alternatively, some other studies introduce multiscale encoder architecture [23] and the atrous spatial pyramid pooling (ASPP) [24] to obtain the multi-scale contextual information.…”
Section: Introduction (mentioning)
confidence: 99%
“…Supervised deep-learning-based methods are a possible solution to realise automatic and accurate building extraction from remotely sensed data. The rapid development of deep learning, especially convolutional neural networks [13][14][15][16][17][18][19][20][21] and transformers [22][23][24], has made deep-learning-based methods the mainstream for building extraction, and many impressive results have been achieved. However, deep-learning-based methods still rely on a large number of labelled samples to obtain satisfactory results, and these samples are often manually labelled, which is time and labour consuming.…”
Section: Introduction (mentioning)
confidence: 99%
“…If we can extract sufficient information about the buildings from the used data in an unsupervised manner, it is possible to design an unsupervised method to extract buildings automatically and accurately, avoiding manual data labelling and manual parameter(s) tuning. According to the data types used, existing building extraction methods can be divided into three categories: (1) methods based on remote sensing images [20,[31][32][33][34], (2) methods based on three-dimensional data (often LiDAR point clouds) [10,[35][36][37][38][39], and (3) methods that combine remote sensing images and three-dimensional data [6,16,30,[40][41][42][43].…”
Section: Introduction (mentioning)
confidence: 99%