Fine-grained ocean ship classification plays a crucial role in maritime military surveillance, traffic management, and anti-smuggling operations. However, the complex backgrounds of remote sensing images (RSIs), together with significant inter-class similarities and intra-class differences, result in poor classification performance. We therefore propose MSCL-Net, a multi-scale contrastive learning network for fine-grained ship classification (FGSC). First, we adopt ResNet50 as the backbone network and use a feature pyramid network (FPN) to extract multi-layer features. Second, we propose a channel spatial attention module (CSAM) that extracts the similarity (contrastive) features of samples from the same class, strengthening representation learning to address the significant inter-class similarity and intra-class difference. Third, we propose a region cropping and enlargement module (RCEM) that extracts fine-grained features from locally discriminative regions of RSIs to overcome the challenge of background complexity. Finally, we use the CSAM to fuse the features of the original image with those of the cropped region image for classification. In addition, we introduce a combined loss based on center loss and PolyLoss, which makes the features more discriminative and is better suited to imbalanced datasets than cross-entropy loss. We evaluate MSCL-Net on the public fine-grained ship classification dataset FGSC-23 and on our own FGSC-41 dataset. The experimental results show superior performance compared with other state-of-the-art methods, highlighting the effectiveness of MSCL-Net in addressing the challenges of fine-grained ship classification. Ablation experiments further support the effectiveness of our design.
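
To make the combined loss concrete, the following is a minimal per-sample sketch, not the implementation used in MSCL-Net. The abstract does not state which PolyLoss variant is used or how the two terms are weighted, so the sketch assumes the Poly-1 form (cross-entropy plus `epsilon * (1 - p_t)`), the standard squared-distance center loss, and a hypothetical weight `lam`. All function names and default values are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poly1_loss(logits, target, epsilon=1.0):
    """Poly-1 loss: cross-entropy plus epsilon * (1 - p_t).

    Assumption: the Poly-1 variant of PolyLoss; epsilon=1.0 is illustrative.
    """
    p_t = softmax(logits)[target]
    return -math.log(p_t) + epsilon * (1.0 - p_t)


def center_loss(feature, center):
    """Center loss: half the squared distance from a feature to its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def combined_loss(logits, feature, target, centers, epsilon=1.0, lam=0.01):
    """Weighted sum of the two terms; `lam` is a hypothetical weighting factor."""
    return poly1_loss(logits, target, epsilon) + lam * center_loss(
        feature, centers[target]
    )


# Toy usage: two classes, a 2-D feature, and fixed class centers.
centers = [[0.0, 0.0], [3.0, 3.0]]
loss = combined_loss([0.0, 0.0], [1.0, 2.0], 0, centers, epsilon=1.0, lam=0.1)
```

With `epsilon=0` the first term reduces to ordinary cross-entropy, which makes the comparison with the cross-entropy baseline explicit. In a real training loop the class centers would be learned or updated alongside the network rather than fixed as they are here.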