Summary

Feature selection is a useful tool for data classification, since heterogeneous data and redundant features are common in the current era of exploding data volumes. Commonly used feature selection algorithms, including but not limited to the Pearson correlation coefficient, the maximal information coefficient, and ReliefF, are well-posed under the assumption that instances are distributed homogeneously in datasets. In practice, however, this assumption may not hold. In the presence of data imbalance, these traditional feature selection algorithms may become invalid because they are biased against the minority class, which contains few samples. This article aims to develop an effective feature selection algorithm for imbalanced judicial datasets that extracts essential features and deletes negligible ones according to practical feature requirements. To this end, the number and the distribution of samples in each class are fully taken into account in the correlation analysis. Compared with traditional feature selection algorithms, the proposed improved ReliefF algorithm provides: (i) feature weights that differ according to the characteristics of heterogeneous samples in different classes; (ii) fair treatment of imbalanced datasets; and (iii) threshold constraints derived from practical feature requirements. Finally, experiments on a judicial dataset and six public datasets illustrate the effectiveness and superiority of the proposed feature selection algorithm in improving classification accuracy on imbalanced datasets.
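The summary does not give the update rule of the improved ReliefF algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one plausible way of accounting for class sizes: each class contributes equally to the feature weights regardless of how many samples it has, and features are kept only if their weight exceeds a user-supplied threshold. The function name `balanced_relieff` and the parameters `n_neighbors` and `threshold` are hypothetical and chosen for illustration.

```python
import numpy as np


def balanced_relieff(X, y, n_neighbors=3, threshold=0.0):
    """ReliefF-style feature weights with equal weight per class.

    Assumption (not taken from the article): the per-instance updates are
    averaged within each class first, then across classes, so a minority
    class influences the weights as much as a majority class.
    Returns (weights, mask) where mask marks features with weight > threshold.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Scale each feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant features: avoid division by zero
    Xs = (X - X.min(axis=0)) / span

    classes = np.unique(y)
    weights = np.zeros(X.shape[1])

    for c in classes:
        idx_c = np.flatnonzero(y == c)
        others = classes[classes != c]
        class_score = np.zeros(X.shape[1])

        for i in idx_c:
            diffs = np.abs(Xs - Xs[i])   # per-feature differences to instance i
            dist = diffs.sum(axis=1)     # Manhattan distance to instance i
            update = np.zeros(X.shape[1])

            # Nearest hits: same class, excluding the instance itself.
            hits = idx_c[idx_c != i]
            if hits.size:
                k = min(n_neighbors, hits.size)
                nearest = hits[np.argsort(dist[hits])[:k]]
                update -= diffs[nearest].mean(axis=0)

            # Nearest misses: every other class weighted equally
            # (an assumption; standard ReliefF uses class priors here).
            for m in others:
                idx_m = np.flatnonzero(y == m)
                k = min(n_neighbors, idx_m.size)
                nearest = idx_m[np.argsort(dist[idx_m])[:k]]
                update += diffs[nearest].mean(axis=0) / others.size

            class_score += update

        # Average within the class so class size does not dominate.
        weights += class_score / idx_c.size

    weights /= classes.size
    return weights, weights > threshold
```

Calling `balanced_relieff(X, y, threshold=0.1)` on a feature matrix `X` and label vector `y` returns the weight vector and a boolean mask of retained features. A suitable threshold depends on the dataset and on the practical feature requirements the summary mentions; the value 0.1 here is arbitrary.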