Semantic segmentation of remote sensing images faces four significant difficulties: 1) complex backgrounds, 2) large differences in scale, 3) numerous small objects, and 4) extreme foreground-background imbalance. However, existing generic semantic segmentation models mainly focus on modeling contextual information and rarely address these four issues. This article presents an enhanced semantic segmentation framework for remote sensing images that tackles these problems through the Hierarchical Atrous Pyramid module (HASP) and a spatial-adaptive-convolution-based FPN decoder framework (SPA-FPN). On the one hand, HASP addresses complex backgrounds and large differences in scale by further enlarging the receptive field of the network through a cascade of atrous convolutions with various rates. On the other hand, spatial adaptive convolution is embedded step by step in the FPN decoder framework to address numerous small objects and extreme foreground-background imbalance. In addition, a boundary-based loss function is constructed to help the network optimize the segmentation results. Extensive experiments on the iSAID and ISPRS Vaihingen datasets show that the presented structure outperforms conventional state-of-the-art semantic segmentation approaches. The code will be made available at https://github.com/jlhou/SPANet.
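To illustrate why cascading atrous convolutions enlarges the receptive field, the following minimal sketch computes the per-axis receptive field of stride-1 atrous convolutions stacked in sequence and, for comparison, of the widest single branch when the same rates are applied side by side. The rates `[1, 2, 4, 8]`, the 3x3 kernel size, and the function names are assumptions chosen for illustration; they are not the HASP configuration, which the abstract does not specify.

```python
def cascaded_receptive_field(rates, kernel_size=3):
    """Per-axis receptive field of stride-1 atrous convs applied in sequence."""
    rf = 1
    for r in rates:
        # A k-tap kernel with dilation r spans (k - 1) * r + 1 input positions,
        # so each stacked layer widens the receptive field by (k - 1) * r.
        rf += (kernel_size - 1) * r
    return rf


def widest_parallel_receptive_field(rates, kernel_size=3):
    """Largest per-axis receptive field among stride-1 atrous branches applied in parallel."""
    return max((kernel_size - 1) * r + 1 for r in rates)


# Hypothetical dilation rates, not taken from the paper.
rates = [1, 2, 4, 8]
print(cascaded_receptive_field(rates))         # 1 + 2 * (1 + 2 + 4 + 8) = 31
print(widest_parallel_receptive_field(rates))  # 2 * 8 + 1 = 17
```

Under these assumptions the cascade covers 31 pixels per axis, whereas the widest single branch covers 17, which is consistent with the abstract's claim that cascading rates further enlarges the receptive field.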