CGSANet: A Contour-Guided and Local Structure-Aware Encoder–Decoder Network for Accurate Building Extraction From Very High-Resolution Remote Sensing Imagery

Chen, Shanxiong; Shi, Wenzhong; Zhou, Mingting; Zhang, Min; Xuan, Zhaoxin

doi:10.1109/jstars.2021.3139017

Cited by 28 publications

(12 citation statements)

References 56 publications


“…Post-processing-based building mapping refines or vectorizes binary building masks through post-processing procedures. Since vectorization of binary building masks tends to produce blurred and irregular boundaries, many boundary refinement methods [14]-[17], [34] have been studied to regularize building boundaries by multi-scale feature fusion, building shape information embedding, or other post-processing procedures. BCTNet [35] proposes a bi-branch cross-fusion transformer network that uses a CNN and a transformer to enhance multi-scale features from local and global aspects.…”

Section: A Post-processing Based Building Mapping (mentioning)

confidence: 99%

“…All these pixel-wise segmentation-based methods fail to obtain accurate building boundaries due to dense buildings and similar backgrounds in remote sensing images. To refine blurred boundaries, some studies [14]-[17] introduce boundary-preserved modules to regularize building boundaries. Although recent pixel-wise segmentation methods produce accurate buildings with precise boundaries, they usually output raster building segmentation masks, requiring a delicate post-vectorization pipeline to meet real-world geographic applications.…”

Section: Introduction (mentioning)

confidence: 99%


Multi-Layer Fault-Tolerant Protection Strategies for Hybrid Distribution Transformers Integrated Photovoltaic Systems

Zhang, Liu, Wang et al. 2023

IEEE Trans. on Ind. Applicat.


Deep learning-based methods have been extensively explored for automatic building mapping from high-resolution remote sensing images over recent years. While most building mapping models produce vector polygons of buildings for geographic and mapping systems, dominant methods typically decompose polygonal building extraction into several sub-problems, including segmentation, polygonization, and regularization, leading to complex inference procedures, low accuracy, and poor generalization. In this paper, we propose a simple and novel building mapping method with Hierarchical Transformers, called HiT, improving polygonal building mapping quality from high-resolution remote sensing images. HiT builds on a two-stage detection architecture by adding a polygon head parallel to the classification and bounding box regression heads. HiT simultaneously outputs building bounding boxes and vector polygons and is fully end-to-end trainable. The polygon head formulates a building polygon as serialized vertices with the bidirectional characteristic, a simple and elegant polygon representation avoiding the start or end vertex hypothesis. Under this new perspective, the polygon head adopts a transformer encoder-decoder architecture to predict serialized vertices supervised by the designed bidirectional polygon loss. Furthermore, a hierarchical attention mechanism combined with convolution operation is introduced in the encoder of the polygon head, providing more geometric structures of building polygons at vertex and edge levels. Comprehensive experiments on two benchmarks (the CrowdAI and Inria datasets) demonstrate that our method achieves a new state-of-the-art in terms of instance segmentation and polygonal metrics compared with state-of-the-art methods. Moreover, qualitative results verify the superiority and effectiveness of our model under complex scenes.


“…To address the absence of detailed multi-scale information at boundaries, some researchers introduce auxiliary modules to refine the boundary information [22]. Alternatively, other studies introduce multiscale encoder architectures [23] and atrous spatial pyramid pooling (ASPP) [24] to obtain multi-scale contextual information.…”

Section: Introduction (mentioning)

confidence: 99%

MDCGA-Net: Multiscale Direction Context-Aware Network With Global Attention for Building Extraction From Remote Sensing Images

Niu, Gu, Zhang et al. 2024

IEEE J. Sel. Top. Appl. Earth Observations Remote Sensing


Building extraction from remote sensing images (RSIs) requires exploring multi-scale boundary detailed information and extracting it completely, which is challenging but indispensable. However, existing solutions tend to augment feature information solely through multi-scale fusion and apply attention mechanisms to focus on feature relationships within a single layer while ignoring the multi-scale information, which affects segmentation results. Therefore, enhancing the capability of the network to adaptively capture multi-scale information and the global relationship of features remains a pivotal challenge. To address this challenge, we propose a Multi-scale Direction Context-aware network with Global Attention (MDCGA-Net), employing a classic encoder-decoder architecture enhanced with direction information and global attention flow. Specifically, in the encoder part, the multi-scale layer (MSL) is used to extract contextual information from the inter-layer. Additionally, the multi-scale direction context-aware module (MDCM) is adopted to adaptively acquire multi-scale information. In the decoder part, we propose a global attention gate module (GAGM) to capture discriminative features. Furthermore, we construct an operation of attention feature flow to obtain the global relationship among the different features with long-range dependencies, which guarantees the integrity of results. Finally, we have performed comprehensive experiments on three public datasets to showcase the efficacy and efficiency of MDCGA-Net in building extraction.


“…Supervised deep-learning-based methods are a possible solution to realise automatic and accurate building extraction from remotely sensed data. The rapid development of deep learning, especially convolutional neural networks [13–21] and transformers [22–24], has made deep-learning-based methods the mainstream for building extraction, and many impressive results have been achieved. However, deep-learning-based methods still rely on a large number of labelled samples to obtain satisfactory results, and these samples are often manually labelled, which is time and labour consuming.…”

Section: Introduction (mentioning)

confidence: 99%

“…If we can extract sufficient information about the buildings from the used data in an unsupervised manner, it is possible to design an unsupervised method to extract buildings automatically and accurately, avoiding manual data labelling and manual parameter(s) tuning. According to the data types used, existing building extraction methods can be divided into three categories: (1) methods based on remote sensing images [20,31–34], (2) methods based on three-dimensional data (often LiDAR point clouds) [10,35–39], and (3) methods that combine remote sensing images and three-dimensional data [6,16,30,40–43].…”

Section: Introduction (mentioning)

confidence: 99%

Unsupervised Building Extraction from Multimodal Aerial Data Based on Accurate Vegetation Removal and Image Feature Consistency Constraint

et al. 2022

Self Cite


Accurate building extraction from remotely sensed data is difficult to perform automatically because of the complex environments and the complex shapes, colours and textures of buildings. Supervised deep-learning-based methods offer a possible solution to this problem. However, these methods generally require many high-quality, manually labelled samples to obtain satisfactory test results, and their production is time and labour intensive. For multimodal data with sufficient information, it is desirable to extract buildings accurately in as unsupervised a manner as possible. Combining remote sensing images and LiDAR point clouds for unsupervised building extraction is not a new idea, but existing methods often experience two problems: (1) the accuracy of vegetation detection is often not high, which leads to limited building extraction accuracy, and (2) they lack a proper mechanism to further refine the building masks. We propose two methods to address these problems, combining aerial images and aerial LiDAR point clouds. First, we improve two recently developed vegetation detection methods to generate accurate initial building masks. We then refine the building masks based on the image feature consistency constraint, which can replace inaccurate LiDAR-derived boundaries with accurate image-based boundaries, remove the remaining vegetation points and recover some missing building points.
Our methods do not require manual parameter tuning or manual data labelling, but still exhibit a competitive performance compared to 29 methods: our methods exhibit accuracies higher than or comparable to 19 state-of-the-art methods (including 8 deep-learning-based methods and 11 unsupervised methods, and 9 of them combine remote sensing images and 3D data), and outperform the top 10 methods (4 of them combine remote sensing images and LiDAR data) evaluated using all three test areas of the Vaihingen dataset on the official website of the ISPRS Test Project on Urban Classification and 3D Building Reconstruction in average area quality. These comparative results verify that our unsupervised methods combining multisource data are very effective.
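The "average area quality" mentioned in the abstract above is not defined on this page. A minimal sketch follows, assuming the commonly used area-based definitions: completeness = TP / (TP + FN), correctness = TP / (TP + FP), and quality = TP / (TP + FP + FN), counted per pixel. The function name `area_metrics` and the toy masks are hypothetical and are not taken from the cited paper.

```python
from typing import Dict, Sequence

Mask = Sequence[Sequence[int]]  # binary raster: 1 = building, 0 = background


def area_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Per-pixel completeness, correctness and quality for two binary masks.

    Assumed definitions (not stated in the source page):
      completeness = TP / (TP + FN)
      correctness  = TP / (TP + FP)
      quality      = TP / (TP + FP + FN)
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building pixel correctly extracted
            elif p and not t:
                fp += 1          # background labelled as building
            elif t and not p:
                fn += 1          # building pixel missed

    def ratio(num: int, den: int) -> float:
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    return {
        "completeness": ratio(tp, tp + fn),
        "correctness": ratio(tp, tp + fp),
        "quality": ratio(tp, tp + fp + fn),
    }


# Toy example with made-up masks (2 rows x 3 columns).
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
scores = area_metrics(predicted, reference)
print(scores["quality"])
```

Averaging the quality value over several test areas would give a figure of the kind the abstract reports; how the benchmark itself aggregates areas is not described here.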
