In a mature manufacturing system, operating faults are few and far between. The majority of the data collected from such systems therefore reflects normal operating behaviour, which creates an imbalance between the class distributions of the data. The imbalance ratio may range from 1:100 to 1:1000, i.e. one fault-condition sample for every 100 to 1000 normal-condition samples. Such datasets make it harder to build reliable models for accurate fault diagnosis in Condition-Based Maintenance (CBM), because the fault class offers too few learning exemplars. Conventional machine learning algorithms do not handle imbalanced datasets well and generally produce poor classification results. To improve the reliability of fault diagnosis on class-imbalanced datasets, this paper proposes a hybrid rebalancing approach that combines Support Vector Machine (SVM) undersampling with Mega Trend Diffusion (MTD) oversampling. The proposed approach rebalances the dataset by (1) reducing the amount of normal-condition data whilst retaining the most informative samples and (2) increasing the number of fault-condition samples to match the size of the normal data. This approach is highly applicable to the manufacturing setting because the data are predictable to a degree, i.e. data from different fault conditions tend to cluster together in the feature space, so manipulating the data at this level is a logical step. Learning effectively from the limited fault data available can therefore translate into significant cost savings. Our approach is demonstrated and validated with a case study on bearing fault detection. Finally, conclusions and future work are discussed.
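As an illustration of step (2) only, the following is a minimal sketch of MTD-style oversampling, not the implementation used in the paper. It assumes the commonly cited MTD formulation, in which each feature's range is widened around the midpoint of its observed minimum and maximum and synthetic values are accepted according to a triangular membership function. It also assumes that features are sampled independently, which is a simplification. The function names (`mtd_bounds`, `membership`, `mtd_oversample`) and the sample values are hypothetical. The SVM undersampling of step (1) is not shown.

```python
import math
import random
import statistics


def mtd_bounds(values, eps=1e-20):
    """Return (a, b, centre): diffused lower/upper bounds and midpoint of one feature.

    Assumed formulation: the range is widened around the midpoint of the
    observed min and max, scaled by the sample variance and by the share
    of points lying on each side of the midpoint.
    """
    lo, hi = min(values), max(values)
    centre = (lo + hi) / 2.0
    n_l = sum(1 for v in values if v < centre)  # points below the midpoint
    n_u = sum(1 for v in values if v > centre)  # points above the midpoint
    var = statistics.variance(values) if len(values) > 1 else 0.0
    if var == 0.0 or n_l == 0 or n_u == 0:
        return lo, hi, centre  # degenerate feature: no diffusion applied
    total = n_l + n_u
    spread = -2.0 * math.log(eps)
    a = centre - (n_l / total) * math.sqrt(spread * var / n_l)
    b = centre + (n_u / total) * math.sqrt(spread * var / n_u)
    # The diffused range must never be narrower than the observed range.
    return min(a, lo), max(b, hi), centre


def membership(x, a, b, centre):
    """Triangular membership: 1 at the midpoint, falling to 0 at a and b."""
    if x <= a or x >= b:
        return 0.0
    if x <= centre:
        return (x - a) / (centre - a)
    return (b - x) / (b - centre)


def mtd_oversample(fault_rows, n_new, seed=0):
    """Generate n_new synthetic fault rows, one feature at a time."""
    rng = random.Random(seed)
    params = [mtd_bounds(list(col)) for col in zip(*fault_rows)]
    synthetic = []
    for _ in range(n_new):
        row = []
        for a, b, centre in params:
            if a == b:  # constant feature: reuse the single observed value
                row.append(a)
                continue
            while True:  # rejection sampling under the triangular membership
                x = rng.uniform(a, b)
                if rng.random() < membership(x, a, b, centre):
                    row.append(x)
                    break
        synthetic.append(row)
    return synthetic


# Hypothetical fault-class samples with two features each.
fault = [[0.9, 2.1], [1.1, 1.9], [1.0, 2.0], [1.2, 2.2]]
new_rows = mtd_oversample(fault, n_new=20)
```

In a full pipeline, `n_new` would be chosen so that the enlarged fault class matches the size of the undersampled normal class.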