In fault diagnosis, machine learning is valued for its broad applicability and efficiency. Feature extraction and feature selection are key steps in applying machine learning, and diagnostic performance depends heavily on how well both are carried out. This paper therefore aims to improve fault diagnosis by improving these two steps. First, to address the non-linearity and non-stationarity of rotating machinery vibration signals under variable operating conditions, this paper proposes an improved rapid refined composite multiscale sample entropy (IR2CMSE) feature extraction method. The vibration signals are decomposed with improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), and the IR2CMSE values of the sensitive intrinsic mode functions (IMFs), denoted SI-IR2CMSE, are extracted as the initial feature vector, which more accurately reveals the intrinsic time-scale characteristics of the vibration signals. Second, because most feature selection methods rely heavily on sample labels, this paper proposes a semi-supervised Gaussian mixture model with sparse regularization (SSGMM-SR) for feature selection. The model does not require complete fault labels and can automatically identify important features. Finally, validation on two rotating machinery fault datasets shows that the proposed method achieves high diagnostic accuracy and stability across multiple classifiers.
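For orientation, the following is a minimal sketch of the conventional refined composite multiscale sample entropy, which the name IR2CMSE suggests as its starting point. It is not the improved rapid variant proposed in this paper, whose modifications are not described in the abstract. The function names (`coarse_grain`, `rcmse`), the embedding dimension `m = 2`, and the default tolerance of 0.15 times the standard deviation of the input are illustrative assumptions.

```python
import numpy as np


def coarse_grain(x, scale, offset=0):
    """Average non-overlapping windows of length `scale`, starting at `offset`."""
    x = np.asarray(x, dtype=float)
    n = (len(x) - offset) // scale
    return x[offset:offset + n * scale].reshape(n, scale).mean(axis=1)


def _match_counts(x, m, r):
    """Count template pairs (i < j) that match at length m (b) and m + 1 (a).

    Two templates match when their Chebyshev distance is at most r.
    """
    # Each row is a template of length m + 1; its first m entries form
    # the corresponding length-m template.
    t = np.lib.stride_tricks.sliding_window_view(x, m + 1)
    b = a = 0
    for i in range(len(t) - 1):
        d_m = np.max(np.abs(t[i + 1:, :m] - t[i, :m]), axis=1)
        within_m = d_m <= r
        b += int(np.sum(within_m))
        d_last = np.abs(t[i + 1:, m] - t[i, m])
        a += int(np.sum(within_m & (d_last <= r)))
    return b, a


def rcmse(x, scale, m=2, r=None):
    """Refined composite multiscale sample entropy at one scale factor.

    Match counts are summed over all `scale` coarse-graining offsets
    before taking the logarithm. Assumed default: r = 0.15 * std(x).
    """
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.15 * np.std(x)
    b_total = a_total = 0
    for offset in range(scale):
        y = coarse_grain(x, scale, offset)
        if len(y) <= m + 1:
            continue  # series too short to form template pairs at this scale
        b, a = _match_counts(y, m, r)
        b_total += b
        a_total += a
    if a_total == 0 or b_total == 0:
        return float("nan")  # entropy is undefined without matches
    return float(-np.log(a_total / b_total))
```

Under these assumptions, calling `rcmse(signal, scale)` for `scale = 1, 2, ...` on each selected IMF would give one entropy value per scale, and concatenating those values is one plausible way to assemble a feature vector of the kind described above.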