The development of medical image classification models requires a substantial number of labeled images for training. In real-world scenarios, however, sample sizes are typically limited and labeled samples often constitute only a small portion of the dataset. This paper investigates a collaborative similarity learning strategy, termed the joint similarity learning framework, that optimizes pseudo-labels to improve model accuracy and accelerate convergence. By integrating semantic similarity and instance similarity, the framework mutually refines pseudo-labels to ensure their quality during early training. Furthermore, the similarity score is used as a per-sample weight to steer the classifier away from misclassified predictions. To improve generalization, an adaptive consistency constraint is introduced into the loss function, boosting performance on unseen data. The model achieved an accuracy of 93.65% at an 80% labeling ratio, comparable to fully supervised methods, and still attained 74.28% accuracy at a very low labeling ratio (e.g., 5%). Comparisons with techniques such as Mean Teacher and FixMatch show that our approach outperforms them on medical image classification tasks, improving accuracy by approximately 2% and demonstrating the framework's effectiveness for medical image classification.
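The abstract describes the method only at a high level; the following is a minimal, hypothetical PyTorch sketch of how the described components could fit together, namely fusing semantic similarity (classifier probabilities) with instance similarity (feature similarity to labeled class prototypes) to refine pseudo-labels, and using the agreement between the two as a weight in a consistency loss. All function names, the prototype-based instance similarity, and the hyperparameters (temperature, threshold) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: similarity-based pseudo-label refinement and a
# similarity-weighted consistency loss (assumed design, not the paper's code).
import torch
import torch.nn.functional as F

def refine_pseudo_labels(weak_logits, weak_embed, labeled_embed, labeled_onehot,
                         temperature=0.5):
    """Fuse semantic similarity (classifier softmax) with instance similarity
    (cosine similarity to labeled class prototypes) into refined pseudo-labels."""
    # Semantic similarity: class probabilities from the classifier head.
    semantic = F.softmax(weak_logits / temperature, dim=1)                # (B, C)

    # Instance similarity: cosine similarity between unlabeled embeddings and
    # class prototypes averaged from labeled embeddings.
    prototypes = labeled_onehot.t() @ labeled_embed                       # (C, D)
    prototypes = F.normalize(prototypes, dim=1)
    instance = F.softmax(F.normalize(weak_embed, dim=1) @ prototypes.t(), dim=1)  # (B, C)

    # Mutual refinement: average the two similarity views and renormalize.
    fused = 0.5 * (semantic + instance)
    refined = fused / fused.sum(dim=1, keepdim=True)

    # Similarity score: agreement between the two views, used as a per-sample
    # weight to down-weight likely misclassified predictions.
    weight = (semantic * instance).sum(dim=1)                             # (B,)
    return refined, weight

def weighted_consistency_loss(strong_logits, refined, weight, threshold=0.7):
    """Consistency loss on strongly augmented views, weighted by the similarity
    score; low-confidence pseudo-labels are masked out."""
    conf, pseudo = refined.max(dim=1)
    mask = (conf >= threshold).float() * weight
    loss = F.cross_entropy(strong_logits, pseudo, reduction="none")
    return (mask * loss).mean()
```

In this sketch the "adaptive" aspect of the consistency constraint is reduced to a fixed confidence threshold combined with the learned similarity weight; the actual adaptation scheme would follow the paper's formulation.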