Oversampling is an efficient technique for dealing with the class-imbalance problem. It addresses the problem by duplicating existing minority-class samples or generating new ones, so that the distribution between the majority and minority classes is balanced. The synthetic minority oversampling technique (SMOTE) is a typical representative, and researchers have proposed many variants of it over the past decade. However, existing oversampling methods may generate wrong minority-class samples in some scenarios. Furthermore, how to effectively mine the inherent complex characteristics of imbalanced data remains a challenge. To this end, this paper proposes a parameter-free data cleaning method, based on the constructive covering algorithm, to improve SMOTE. The dataset generated by SMOTE is first partitioned into a group of covers; hard-to-learn samples are then detected from the characteristics of the sample space distribution. Finally, a pair-wise deletion strategy removes the hard-to-learn samples. Experimental results on 25 imbalanced datasets show that the proposed method outperforms the comparison methods on various metrics, such as F-measure, G-mean, and Recall. The method not only reduces the complexity of the dataset but also improves the performance of the classification model.
INDEX TERMS Imbalanced data, SMOTE, oversampling, constructive covering algorithm, data cleaning.
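The three stages named in the abstract (oversample with SMOTE, partition into covers, delete hard-to-learn samples in pairs) can be sketched roughly as follows. This is only an illustrative reading, not the paper's implementation: the abstract does not define how covers are built, what makes a sample hard to learn, or how pairs are chosen. The sketch assumes that a cover is a ball around a randomly chosen uncovered sample that stops short of the nearest sample of another class, that a sample is hard to learn when it sits alone in its cover, and that a pair is a hard sample plus its nearest opposite-class neighbour when that neighbour is also hard. The function names `smote`, `build_covers`, and `clean` are hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def smote(minority, n_new, k=3, seed=0):
    """Plain SMOTE: interpolate between a minority sample and one of its
    k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def build_covers(X, y, seed=0):
    """Assumed covering step: grow a single-class ball around a random
    uncovered sample, bounded by the nearest sample of another class."""
    rng = random.Random(seed)
    uncovered = set(range(len(X)))
    covers = []
    while uncovered:
        c = rng.choice(sorted(uncovered))
        d_other = min((dist(X[c], X[j]) for j in range(len(X)) if y[j] != y[c]),
                      default=math.inf)
        members = [c] + [j for j in sorted(uncovered)
                         if j != c and y[j] == y[c] and dist(X[c], X[j]) < d_other]
        covers.append(members)
        uncovered -= set(members)
    return covers

def clean(X, y, seed=0):
    """Assumed cleaning rule: a sample alone in its cover is 'hard'; drop a
    hard sample together with its nearest opposite-class neighbour when
    that neighbour is hard as well."""
    covers = build_covers(X, y, seed)
    hard = {m[0] for m in covers if len(m) == 1}
    drop = set()
    for i in sorted(hard):
        rivals = [j for j in range(len(X)) if y[j] != y[i]]
        if not rivals:
            continue
        j = min(rivals, key=lambda r: dist(X[i], X[r]))
        if j in hard:
            drop.update((i, j))
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data (made up for illustration): six majority and three minority points.
majority = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.6), (0.7, 0.7), (0.1, 0.3), (0.6, 0.1)]
minority = [(3.0, 3.0), (3.4, 2.9), (2.9, 3.5)]
synthetic = smote(minority, n_new=3, k=2)
X = majority + minority + synthetic
y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))
X_clean, y_clean = clean(X, y)
print(len(X), "samples before cleaning,", len(X_clean), "after")
```

Nothing in the sketch takes a tuning parameter beyond the random seed and SMOTE's own `k`, which is in the spirit of the parameter-free claim, although the actual cover construction and deletion rule in the paper may differ.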
Algorithms for wavefront sensing and error correction from intensity measurements attract great interest in many fields. Here we propose using Bayesian optimization to retrieve the phase and demonstrate its performance in simulation and experiment. For small aberrations, the method converges with high phase-sensing accuracy, which is also verified experimentally. For large aberrations, Bayesian optimization is shown to be insensitive to the initial phase while maintaining high accuracy. Its high accuracy and robustness make the approach promising for optical systems with static aberrations, such as AMO experiments, optical testing shops, and electron or optical microscopes.
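As a rough illustration of the idea, the sketch below runs Bayesian optimization over a single aberration coefficient. The abstract does not specify the surrogate model, the acquisition function, the phase basis, or the intensity metric, so everything here is an assumption: a Gaussian-process surrogate with an RBF kernel, an upper-confidence-bound acquisition evaluated on a grid, and a toy metric `focal_metric` that stands in for a measured intensity and peaks when the trial correction cancels a hidden aberration. All function names and parameter values are hypothetical.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k)
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)

def focal_metric(coeff, true_aberration=1.2):
    """Toy stand-in for a measured intensity metric (assumption): equals 1
    when the trial coefficient matches the hidden aberration."""
    return math.exp(-(coeff - true_aberration) ** 2)

def bayes_opt(objective, lo=-3.0, hi=3.0, n_grid=61, n_iter=12, kappa=2.0):
    """Maximise the objective with GP-UCB over a fixed grid of candidates."""
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    xs = [grid[0], grid[n_grid // 2], grid[-1]]   # initial design
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best_x, best_ucb = None, -math.inf
        for x in grid:
            if x in xs:                            # skip points already measured
                continue
            mean, std = gp_posterior(xs, ys, x)
            ucb = mean + kappa * std
            if ucb > best_ucb:
                best_x, best_ucb = x, ucb
        xs.append(best_x)
        ys.append(objective(best_x))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

estimate, value = bayes_opt(focal_metric)
print("estimated coefficient:", estimate, "metric:", value)
```

A real system would optimize several coefficients at once and evaluate the metric from camera images; the one-dimensional grid search over the acquisition function is used here only to keep the example short.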