Optical and polarimetric synthetic aperture radar (PolSAR) images are produced by different imaging mechanisms and therefore carry complementary information; how to exploit this complementarity effectively remains an interesting and challenging problem. Convolutional neural networks (CNNs) and other deep neural networks have achieved good experimental results in land-cover semantic segmentation of remote sensing images. However, a standard convolution extracts features only within its spatial receptive field and does not explicitly model the relationships among channels, so it cannot by itself realize fusion and complementarity between channels from different sources. In this paper, we propose a novel spatial dense channel attention fusion network (SDCAFNet), which takes optical and PolSAR images as separate inputs and performs feature fusion and semantic segmentation within a single neural network. First, SDCAFNet uses a two-stream Siamese CNN to perform preliminary feature encoding of the optical and PolSAR images. Then, a spatial dense channel attention module (SDCAM) is proposed: the channel activation values obtained at different spatial positions are combined into a spatial dense matrix, which describes the attention applied during feature fusion. Finally, the fused features are fed into a symmetric skip-connection decoder composed of multiple symmetric decoder blocks to realize end-to-end land-cover semantic segmentation. Experimental results show that SDCAFNet effectively learns the correlation between optical and PolSAR channels and achieves higher segmentation accuracy than the compared methods.
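
The fusion idea summarized above, channel attention evaluated at every spatial position and collected into a dense matrix, can be illustrated with a small NumPy sketch. This is not the SDCAM as implemented in the paper: the abstract does not specify the layer layout, so the two per-pixel channel projections (`w1`, `w2`), the ReLU and sigmoid activations, the summation of the two weighted streams, and the function name `spatial_dense_channel_attention` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_dense_channel_attention(opt_feat, sar_feat, w1, w2):
    """Hypothetical per-position channel attention fusion (illustrative only).

    opt_feat, sar_feat: feature maps of shape (C, H, W) from the two streams.
    w1: assumed channel-mixing weights of shape (hidden, 2C).
    w2: assumed channel-mixing weights of shape (2C, hidden).
    Returns the fused map (C, H, W) and the dense attention matrix (2C, H, W).
    """
    # Stack the two modalities along the channel axis: (2C, H, W).
    stacked = np.concatenate([opt_feat, sar_feat], axis=0)

    # Mix channels independently at every pixel (equivalent to a 1x1
    # convolution), then apply a ReLU.
    hidden = np.maximum(np.einsum("kc,chw->khw", w1, stacked), 0.0)

    # One activation in (0, 1) per channel and per spatial position:
    # this plays the role of the "spatial dense" attention matrix.
    attention = sigmoid(np.einsum("ck,khw->chw", w2, hidden))

    # Reweight both streams and merge them (summation is an assumption).
    weighted = attention * stacked
    c = opt_feat.shape[0]
    fused = weighted[:c] + weighted[c:]
    return fused, attention


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    optical = rng.standard_normal((4, 8, 8))  # stand-in optical features
    polsar = rng.standard_normal((4, 8, 8))   # stand-in PolSAR features
    w1 = rng.standard_normal((2, 8)) * 0.1
    w2 = rng.standard_normal((8, 2)) * 0.1
    fused, attention = spatial_dense_channel_attention(optical, polsar, w1, w2)
    print(fused.shape, attention.shape)
```

Unlike a globally pooled channel attention vector, the attention here keeps its spatial dimensions, so the relative weight given to optical and PolSAR channels can differ from pixel to pixel.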