In machinery fault diagnosis, a large amount of monitoring data is often unlabeled, while labeled data are limited. Learning effective features from massive unlabeled data is therefore a challenging issue for machinery fault diagnosis. In this paper, a simple unsupervised feature learning method, consistency inference-constrained sparse filtering (CICSF), is proposed to learn mechanical fault features with enhanced clustering performance for fault diagnosis. First, inspired by data augmentation strategies, the consistency inference of latent representations for time series (CILRTS) is derived: training instances segmented from the same time series should possess consistent latent feature representations. Then, CILRTS is integrated into sparse filtering (SF) as an additional constraint in the latent feature space. The resulting CICSF method simultaneously optimizes the inter-class sparsity and intra-class similarity of the feature distribution, and can thus learn more effective features from massive unlabeled data. Finally, a semi-supervised machinery fault diagnosis method is developed based on CICSF: after unsupervised feature learning by CICSF, a softmax regression classifier is trained with limited labeled data to perform fault diagnosis. Experimental results on bearing and gearbox datasets verify the effectiveness of the proposed method. Moreover, comparisons with standard SF and several auto-encoder (AE) variants demonstrate its superiority in unsupervised feature learning and in fault diagnosis with limited labeled data.

INDEX TERMS Unsupervised feature learning, machinery fault diagnosis, consistency inference of latent representations for time series, sparse filtering, auto-encoder.
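To make the idea of constraining sparse filtering with CILRTS concrete, the following is a minimal NumPy sketch and not the formulation used in this paper. It assumes the standard sparse filtering objective (soft-absolute activation, feature-wise then sample-wise L2 normalization, L1 penalty) and, as a purely illustrative choice, models the consistency constraint as the squared deviation of each segment's features from the mean features of its source time series. The function names `segment_series`, `sf_features`, and `cicsf_objective`, the weight `lam`, and the penalty form are all hypothetical.

```python
import numpy as np


def segment_series(series, seg_len):
    """Cut each 1-D series into non-overlapping segments of length seg_len.

    Returns the segments as rows of a 2-D array and, for each segment,
    the index of the series it came from (its consistency group).
    """
    segments, groups = [], []
    for gid, s in enumerate(series):
        for k in range(len(s) // seg_len):
            segments.append(s[k * seg_len:(k + 1) * seg_len])
            groups.append(gid)
    return np.asarray(segments, dtype=float), np.asarray(groups)


def sf_features(W, X, eps=1e-8):
    """Standard sparse filtering features.

    X: (n_samples, n_inputs), W: (n_features, n_inputs).
    Returns F with shape (n_features, n_samples).
    """
    F = np.sqrt((W @ X.T) ** 2 + eps)                 # soft absolute value
    F = F / np.linalg.norm(F, axis=1, keepdims=True)  # normalize each feature over samples
    F = F / np.linalg.norm(F, axis=0, keepdims=True)  # normalize each sample over features
    return F


def cicsf_objective(W, X, groups, lam=1.0):
    """Sparse filtering loss plus an ASSUMED consistency penalty.

    The penalty (hypothetical form) sums, over every source series, the
    squared distance between each segment's features and the group mean.
    """
    F = sf_features(W, X)
    sparsity = F.sum()  # L1 norm, since all entries of F are positive
    consistency = 0.0
    for g in np.unique(groups):
        Fg = F[:, groups == g]
        consistency += ((Fg - Fg.mean(axis=1, keepdims=True)) ** 2).sum()
    return sparsity + lam * consistency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic "time series" stand in for raw vibration signals.
    series = [rng.standard_normal(400) for _ in range(3)]
    X, groups = segment_series(series, seg_len=100)
    W = rng.standard_normal((16, X.shape[1]))
    print("objective:", cicsf_objective(W, X, groups, lam=0.5))
```

In practice `W` would be found by minimizing this objective with a gradient-based optimizer, and the learned features would then feed a softmax regression classifier trained on the limited labeled data; both steps are omitted here.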