This paper addresses semantic segmentation of high-resolution (HR) remote sensing images, where the goal is to predict a semantic label for every pixel. Because the information in HR remote sensing images is rich, complex, and heterogeneous, segmentation performance depends largely on the ability to extract both spatial details (boundary information) and semantic context. Building on the widely used fully convolutional network framework, we propose a boundary enhancing semantic context network (BES-Net) that explicitly uses boundaries to enhance semantic context extraction. BES-Net consists mainly of three modules: (1) a boundary extraction (BE) module that extracts semantic boundary information, (2) a multi-scale semantic context fusion (MSF) module that fuses semantic features of objects at multiple scales, and (3) a boundary enhancing semantic context (BES) module that explicitly enhances the fused semantic features with the extracted boundary information to improve intra-class semantic consistency, especially at pixels containing boundaries. Extensive experimental evaluations and comprehensive ablation studies on the ISPRS Vaihingen and Potsdam datasets demonstrate the effectiveness of BES-Net: combining the BE and MSF modules through the BES module yields an overall improvement of 1.28/2.36/0.72 percent in mF1/mIoU/OA over FCN_8s. In particular, BES-Net achieves state-of-the-art performance of 91.4% OA on the ISPRS Vaihingen dataset and 92.9%/91.5% mF1/OA on the ISPRS Potsdam dataset.
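For readers unfamiliar with the three reported metrics, the following minimal sketch shows how mean F1 (mF1), mean intersection over union (mIoU), and overall accuracy (OA) are commonly derived from a pixel-level confusion matrix. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names `confusion_matrix` and `segmentation_scores` are hypothetical, the class means are unweighted, and classes absent from both prediction and ground truth are scored as 0. The ISPRS benchmark protocol may differ in details such as which classes and boundary pixels are excluded.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_scores(cm):
    """Return mF1, mIoU, and OA from a square confusion matrix.

    Assumption: unweighted mean over all classes; a class with no
    true or predicted pixels contributes 0 to both means.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    f1_per_class, iou_per_class = [], []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                       # truth c, predicted otherwise
        f1_denom = 2 * tp + fp + fn
        iou_denom = tp + fp + fn
        f1_per_class.append(2 * tp / f1_denom if f1_denom else 0.0)
        iou_per_class.append(tp / iou_denom if iou_denom else 0.0)
    correct = sum(cm[c][c] for c in range(n))
    return {
        "mF1": sum(f1_per_class) / n,
        "mIoU": sum(iou_per_class) / n,
        "OA": correct / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy example with flattened label maps for two classes.
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(segmentation_scores(confusion_matrix(truth, pred, 2)))
```

Because mIoU penalizes false positives and false negatives more heavily than F1 does, the same change in predictions typically moves mIoU more than mF1 or OA, which is consistent with the larger mIoU gain reported above.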