Existing research on deep learning for radar signal intra-pulse modulation classification is mainly based on supervised learning techniques, whose performance relies heavily on a large number of labeled samples. To overcome this limitation, a self-supervised learning framework called CL-CNN is proposed, which combines contrastive learning (CL) with a convolutional neural network (CNN) and the focal loss function. CL-CNN adopts a two-stage training strategy. In the first stage, the model is pretrained on abundant unlabeled time-frequency images, with data augmentation used to construct positive and negative sample pairs for self-supervised learning. In the second stage, the pretrained model is fine-tuned for classification using only a small number of labeled time-frequency images. Simulation results demonstrate that CL-CNN outperforms other deep models and traditional methods on signals affected by Gaussian noise and on signals affected by impulsive noise. In addition, CL-CNN shows good generalization ability: the model pretrained with Gaussian-noise-affected samples also performs well on impulsive-noise-affected samples.
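
The two training objectives mentioned above can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not name the contrastive objective, so the NT-Xent loss (as used in SimCLR-style pretraining) is swapped in here as one common choice. The focal loss is assumed to be applied in the fine-tuning stage, which the abstract does not state. The function names and the default values of `temperature`, `gamma`, and `alpha` are hypothetical, and plain Python lists stand in for CNN embeddings and softmax outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Stage 1 (assumed objective): NT-Xent contrastive loss.

    view_a[i] and view_b[i] are embeddings of two augmentations of the same
    unlabeled time-frequency image (a positive pair). Every other embedding
    in the batch acts as a negative for that pair.
    """
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # Numerically stable log-sum-exp over all candidates except i itself.
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        total += lse - pos_logit
    return total / (2 * n)


def focal_loss(probs, label, gamma=2.0, alpha=1.0):
    """Stage 2 (assumed placement): focal loss for one labeled sample.

    probs is the predicted class distribution and label is the true class
    index. With gamma = 0 and alpha = 1 this reduces to cross-entropy.
    """
    p_t = probs[label]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

In this sketch, embeddings whose two views agree yield a lower contrastive loss than embeddings whose views are mismatched, and the focal term `(1 - p_t) ** gamma` down-weights samples the classifier already predicts confidently.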