Among tasks related to the intelligent interpretation of remote sensing data, scene classification focuses on the holistic information of the entire scene. Compared with pixel-level or object-based tasks, it involves richer semantic context and is therefore more challenging. With the rapid advancement of deep learning, convolutional neural networks (CNNs) have found widespread application across various domains, and some work has introduced them into scene classification. However, traditional convolution slides small kernels across an image and primarily captures local details within a limited receptive field, which restricts its ability to extract features over a broader range and to model the image as a whole. To this end, we introduce a large-kernel CNN into the scene classification task to expand the receptive field of the model, allowing it to capture comprehensive non-local information while still acquiring rich local details. In addition to spatial associations, the effective information within the feature maps is also strongly correlated across channels. Therefore, to fully model this channel dependency, we design a novel channel separation and mixing module that captures feature correlations along the channel dimension. Combining the two yields the Large Kernel Separable Mixed Convnet (LSMNet), which captures effective dependencies of the feature maps in both the spatial and channel dimensions, thus achieving enhanced feature representation. Extensive experiments on three datasets validate the effectiveness of the proposed method.
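To make the two ideas concrete, the sketch below illustrates, in PyTorch, one plausible way a block could combine a large-kernel depthwise convolution (for an enlarged spatial receptive field) with a simple channel split-and-shuffle step followed by a pointwise convolution (as a stand-in for channel separation and mixing). This is a minimal, hypothetical example under our own assumptions; the block name, kernel size, group count, and shuffle-based mixing are illustrative and do not reproduce the actual LSMNet architecture described in the paper.

```python
import torch
import torch.nn as nn


class LargeKernelMixBlock(nn.Module):
    """Hypothetical sketch (not the official LSMNet block):
    a depthwise large-kernel convolution for non-local spatial context,
    followed by channel separation (grouped split), channel mixing
    (shuffle + pointwise convolution), and a residual connection."""

    def __init__(self, channels: int, kernel_size: int = 13, groups: int = 4):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by groups"
        self.groups = groups
        # Depthwise large-kernel convolution: wide spatial receptive field
        self.spatial = nn.Conv2d(
            channels, channels, kernel_size,
            padding=kernel_size // 2, groups=channels,
        )
        # Pointwise convolution applied after re-ordering the channel groups
        self.mix = nn.Conv2d(channels, channels, kernel_size=1)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.spatial(x)
        # Channel separation: split channels into groups, then interleave them
        n, c, h, w = y.shape
        y = y.view(n, self.groups, c // self.groups, h, w)
        y = y.transpose(1, 2).reshape(n, c, h, w)  # channel shuffle
        y = self.mix(y)                            # channel mixing
        return x + self.act(self.norm(y))          # residual connection


if __name__ == "__main__":
    block = LargeKernelMixBlock(channels=64)
    feats = torch.randn(2, 64, 56, 56)   # e.g., feature maps from a stem stage
    print(block(feats).shape)            # torch.Size([2, 64, 56, 56])
```

In this sketch, the depthwise 13x13 convolution keeps the parameter count modest while enlarging the receptive field, and the shuffle plus 1x1 convolution forces information to flow across channel groups; the real channel separation and mixing module may differ substantially.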