Self-supervised learning for semi-supervised facial expression recognition aims to reduce the need to collect expensive labeled facial expression data. Existing methods demonstrate impressive performance gains, but they rely on the unrealistic assumption that the small labeled facial expression dataset and the large unlabeled facial expression dataset are drawn from the same class distribution. In a more realistic scenario, where facial expression data from a large face recognition database serve as the unlabeled data, there is a class distribution mismatch between the two sets. This often results in severe performance degradation due to the incorrect propagation of unlabeled data from unrelated sources. In this paper, we address the class distribution mismatch problem in deep semi-supervised learning-based facial expression recognition. Specifically, we propose a silhouette coefficient-based contrast clustering algorithm, which measures the degree of separation between clusters through intra-cluster and inter-cluster distances to accurately detect out-of-distribution data. In addition, we propose a pseudo-labeling rethinking strategy that matches the soft pseudo-labels estimated by a fine-tuned network against the contrast clustering results to produce reliable pseudo-labels. Experiments on three in-the-wild datasets, RAF-DB, FERPlus, and AffectNet, demonstrate the effectiveness of our method, which performs favorably against state-of-the-art methods.

INDEX TERMS Facial expression recognition, semi-supervised learning, contrastive self-supervised learning, out-of-distribution data.
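For context (this is the standard silhouette coefficient rather than the paper's exact criterion), the degree of separation mentioned above is conventionally defined per sample $i$ from its mean intra-cluster distance $a(i)$ and the smallest mean distance $b(i)$ to samples of any other cluster:

\[
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad s(i) \in [-1, 1],
\]

so values near $1$ indicate a compact, well-separated cluster and values near $-1$ indicate likely mis-assignment. The proposed contrast clustering builds its out-of-distribution detection on this notion of intra- versus inter-cluster distance.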