Rich contextual information and multiscale ground-object information in remote sensing images are crucial to improving semantic segmentation accuracy. We therefore propose a remote sensing image semantic segmentation method that integrates multilevel spatial-channel attention with multiscale dilated convolution, effectively addressing the poor segmentation of small objects in remote sensing images. The method builds a multilevel feature fusion structure that combines deep semantic features with shallow detail features to generate multiscale feature maps. We then introduce serially combined dilated convolutions into each branch of the atrous spatial pyramid pooling (ASPP) structure to reduce the loss of small-object information. Finally, a convolutional conditional random field models spatial and edge context to improve the model's ability to extract details. We demonstrate the effectiveness of the model on three public datasets, evaluating four quantitative metrics: F1 score, overall accuracy (OA), Intersection over Union (IoU), and mean Intersection over Union (mIoU). On the GID dataset, the F1 score, OA, and mIoU reach 87.27, 87.80, and 77.70, respectively, surpassing most mainstream semantic segmentation networks.
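The benefit of serially combined dilated convolutions can be illustrated with a small receptive-field calculation. The sketch below (a 1-D toy, not the paper's implementation; the rate schedule 1→2→4 is an assumption for illustration) shows that a single large-rate dilated convolution samples the input sparsely (the "gridding" effect, which can skip small objects), whereas a series of increasing rates covers every position in the same receptive field.

```python
def dilated_taps(kernel_size, dilation):
    """Input offsets sampled by one dilated conv tap, centred at 0 (1-D view)."""
    r = kernel_size // 2
    return [d * dilation for d in range(-r, r + 1)]

def serial_coverage(rates, kernel_size=3):
    """Input offsets that contribute to one output position after stacking
    dilated convolutions in series (stride 1)."""
    covered = {0}
    for rate in reversed(rates):
        covered = {c + t for c in covered for t in dilated_taps(kernel_size, rate)}
    return sorted(covered)

# A single rate-4 conv sees only 3 sparse positions (gridding).
single = serial_coverage([4])          # [-4, 0, 4]
# Serial rates 1, 2, 4 cover every offset in the receptive field,
# so fine detail from small objects is not skipped over.
serial = serial_coverage([1, 2, 4])    # [-7, -6, ..., 7]
```

Dense coverage at the same receptive-field size is the intuition behind replacing a single large-rate branch with a series of smaller rates.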
Extracted building information can be widely applied in urban planning, land resource management, and related fields. This paper proposes a novel building extraction method that improves extraction accuracy by combining a bi-directional feature pyramid with a location-channel attention feature serial fusion module (L-CAFSFM). A ResNeXt101 backbone extracts more precise and abundant building features. The L-CAFSFM combines and fuses adjacent two-level feature maps, and iterating from high level to low level and from low level to high level enhances the model's feature extraction ability at different scales and levels. The DenseCRF algorithm then refines the correlations between pixels. Evaluated on the Wuhan University (WHU) building dataset, our method achieves a precision, F-score, recall, and IoU of 94.94%, 94.32%, 93.70%, and 89.25%, respectively. Compared with the baseline network, it extracts buildings from high-resolution images more accurately.
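The bi-directional iteration over adjacent pyramid levels can be sketched as two sweeps over a list of feature maps. This is a minimal toy, not the paper's L-CAFSFM: it stands in the attention-weighted fusion with an element-wise mean and assumes all levels share the same shape (no resampling), purely to show the high→low then low→high update order.

```python
def fuse(a, b):
    """Element-wise mean as a stand-in for attention-weighted fusion."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def bidirectional_fusion(levels):
    """Two passes over adjacent feature levels: high->low, then low->high.

    `levels` is ordered shallow (index 0) to deep (last index).
    """
    feats = [list(f) for f in levels]
    # Top-down pass: inject deep semantics into shallower levels.
    for i in range(len(feats) - 2, -1, -1):
        feats[i] = fuse(feats[i], feats[i + 1])
    # Bottom-up pass: push refined detail back into deeper levels.
    for i in range(1, len(feats)):
        feats[i] = fuse(feats[i], feats[i - 1])
    return feats

# After both passes, every level mixes information from its neighbours.
out = bidirectional_fusion([[0.0], [2.0], [4.0]])
```

In the real module each `fuse` step would resample one input to the other's resolution and weight the sum with location-channel attention; the sweep order is the part this sketch demonstrates.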