The performance of machine-learning-based bearing fault detection systems depends largely on the selected features. Selecting a small set of dominant features from a comprehensive list is therefore needed to reduce the computation involved in fault detection. In this paper, we investigate, for the first time, the statistical time-domain features Hjorth parameters (activity, mobility, and complexity) and normal negative log likelihood for a Gaussian mixture model (GMM), in addition to 26 other established statistical features, for identifying bearing fault type and severity. Two datasets are derived from the publicly available Case Western Reserve University database to assess the capability of the features in fault identification under various fault sizes and motor loads. The features are investigated using a two-step approach: filter-based ranking with 3 metrics, followed by feature subset selection with 11 search techniques. The results indicate that the feature set consisting of root mean square, geometric mean, zero crossing rate, the Hjorth parameter mobility, and normal negative log likelihood for GMM outperforms other features. We also compare the diagnostic performance of normal negative log likelihood for GMM with that of the established feature, normal negative log likelihood for a single Gaussian. The selected set of statistical features is validated using ensemble rule-based classifiers, which achieve an average accuracy of 96.75% with the proposed feature subset and 99.63% with all 30 features. F-measure and G-mean scores are also calculated to assess performance on datasets with class imbalance. The diagnostic effectiveness of the features is further validated on a bearing dataset obtained from an operating thermal power plant.
The results show that the newly proposed feature subset plays a major role in achieving good classification results and has potential for use with high-dimensional datasets containing multidomain features.
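To make three of the selected time-domain features concrete, the following is a minimal sketch of how root mean square, zero crossing rate, and the Hjorth parameters might be computed for one vibration segment. It is not the authors' implementation: the function names are hypothetical, and the use of first differences as the discrete derivative and of population variance are assumptions, since the abstract does not state the exact definitions used.

```python
import math
from statistics import pvariance


def first_difference(x):
    """Discrete derivative approximated by successive differences (assumption)."""
    return [b - a for a, b in zip(x, x[1:])]


def hjorth_parameters(x):
    """Return (activity, mobility, complexity) for a 1-D signal.

    activity   = var(x)
    mobility   = sqrt(var(x') / var(x))
    complexity = mobility(x') / mobility(x)
    """
    dx = first_difference(x)
    ddx = first_difference(dx)
    var_x = pvariance(x)
    var_dx = pvariance(dx)
    var_ddx = pvariance(ddx)
    activity = var_x
    mobility = math.sqrt(var_dx / var_x)
    complexity = math.sqrt(var_ddx / var_dx) / mobility
    return activity, mobility, complexity


def root_mean_square(x):
    """Square root of the mean of the squared samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)


# Synthetic stand-in for a vibration segment: 5 cycles of a unit sine
# sampled at 1000 points (illustrative data only, not from the paper).
signal = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
activity, mobility, complexity = hjorth_parameters(signal)
rms_value = root_mean_square(signal)
zcr_value = zero_crossing_rate(signal)
```

For a pure sinusoid the complexity is close to 1, and it grows as the waveform departs from a sine shape, which gives some intuition for why such parameters could be informative about impulsive bearing faults.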