High-resolution remote sensing imagery contains multispectral bands whose spatial structures vary in scale, color, and shape. These heterogeneous geographical features pose significant challenges for the fine segmentation required by classification applications in remote sensing imagery, where directly applying traditional image classification models fails to deliver optimal results. To overcome these challenges, a multispectral, multi-label model, MMDL-Net, has been developed. The model is built on the multi-label BigEarthNet dataset, which is widely used for land cover classification research in remote sensing imagery; each image comprises 13 spectral bands at spatial resolutions of 10 m, 20 m, and 60 m. To effectively utilize the information across these bands, a multispectral stacking module has been introduced to concatenate the spectral information. To efficiently process three distinct large-scale remote sensing image datasets, a multi-label classification module has been incorporated for training and inference. To better learn and represent the intricate features within the images, a twin-number residual structure has been proposed. The results demonstrate that MMDL-Net achieves a top accuracy of 83.52% and an F1 score of 77.97%, surpassing other deep learning models and conventional methods and exhibiting excellent performance on the task of multispectral multi-label classification of remote sensing imagery.
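The sketch below illustrates the three ideas named above: resampling bands of differing resolution to a common grid and concatenating them (multispectral stacking), a paired residual block, and a multi-label head. It is a minimal sketch under stated assumptions, not the authors' implementation: PyTorch is assumed, all class names are illustrative, the 4/6/3 band split at 10/20/60 m follows Sentinel-2 (13 bands total), the 120×120 patch size is the BigEarthNet convention, and 19 output labels correspond to BigEarthNet's 19-class nomenclature.

```python
# Hypothetical sketch of the abstract's three components (PyTorch assumed).
# Class names, band grouping, patch size, and label count are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultispectralStack(nn.Module):
    """Resample 20 m and 60 m bands to the 10 m grid, then concatenate
    along the channel axis (assumed form of the stacking module)."""
    def forward(self, b10m, b20m, b60m):
        h, w = b10m.shape[-2:]
        b20m = F.interpolate(b20m, size=(h, w), mode="bilinear", align_corners=False)
        b60m = F.interpolate(b60m, size=(h, w), mode="bilinear", align_corners=False)
        return torch.cat([b10m, b20m, b60m], dim=1)  # (N, 13, H, W) for Sentinel-2

class ResidualBlock(nn.Module):
    """Standard residual block; pairing two such blocks below is only a
    guess at the 'twin-number' structure, which the abstract does not detail."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # identity shortcut

class MultiLabelHead(nn.Module):
    """Multi-label classification: one independent logit per class."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_channels, num_classes)
    def forward(self, x):
        return self.fc(self.pool(x).flatten(1))

# Usage sketch: 4 bands at 10 m, 6 at 20 m, 3 at 60 m; 19 target labels.
stack = MultispectralStack()
stem = nn.Conv2d(13, 64, 3, padding=1)
blocks = nn.Sequential(ResidualBlock(64), ResidualBlock(64))
head = MultiLabelHead(64, 19)

x = stack(torch.randn(2, 4, 120, 120),   # 10 m bands
          torch.randn(2, 6, 60, 60),     # 20 m bands
          torch.randn(2, 3, 20, 20))     # 60 m bands
logits = head(blocks(stem(x)))
# Multi-label training uses a per-class sigmoid via BCE-with-logits,
# since each image may carry several land cover labels at once.
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (2, 19)).float())
```

The per-class sigmoid (rather than a softmax) is the standard choice for multi-label land cover classification, where labels are not mutually exclusive.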