Predicting the audit opinions of listed companies plays a significant role in risk prevention in the securities market. Machine learning methods can improve audit quality, raise audit efficiency, and sharpen auditors' insight. In practice, however, models that predict company audit opinions face two problems: class imbalance and the selection of critical features. This paper first combines batched sparse principal component analysis (BSPCA) with the kernel fuzzy clustering algorithm (KFCM) and proposes a sparse-kernel fuzzy clustering undersampling method (S-KFCM) to deal with the imbalance of sample categories. The method uses the kernel fuzzy clustering algorithm to down-sample the normal samples, while features are extracted from the abnormal sample set using the group sparse component method. The resulting sparse normal sample set preserves the spatial structure of the original distribution and highlights the samples near the classification boundary. Second, the 448 original variables are grouped according to the company’s characteristic attributes and data sources, and BSPCA is then used for feature screening. Finally, a support vector machine (SVM) performs the classification prediction. According to the empirical results, the SKFCM-SVM model achieves the highest prediction accuracy among the models compared.
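To make the undersampling idea concrete, the following is a minimal sketch of clustering-based down-sampling with kernel fuzzy c-means, not the paper's S-KFCM implementation. It assumes a Gaussian kernel and the common input-space KFCM update rules. The selection rule (keep, for each cluster, the sample with the highest membership) and all names (`kfcm`, `undersample_majority`, `sigma`, `n_keep`) are illustrative assumptions; the abstract does not specify these details, and the sparse component step is omitted.

```python
import math
import random


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-d2 / sigma ** 2)


def memberships_for(points, centers, m, sigma, eps=1e-12):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    rows = []
    for x in points:
        # The kernel-induced distance is proportional to 1 - K(x, v);
        # eps avoids division by zero when a point coincides with a center.
        w = [(1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
             for v in centers]
        total = sum(w)
        rows.append([wi / total for wi in w])
    return rows


def kfcm(points, n_clusters, m=2.0, sigma=1.0, n_iter=50, seed=0):
    """Kernel fuzzy c-means with prototypes kept in the input space."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    dim = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centers, m, sigma)
        for i in range(n_clusters):
            # Each point is weighted by u^m times its kernel similarity to the center.
            coef = [u[k][i] ** m * gaussian_kernel(x, centers[i], sigma)
                    for k, x in enumerate(points)]
            denom = sum(coef)
            if denom > 0.0:
                centers[i] = [sum(c * x[d] for c, x in zip(coef, points)) / denom
                              for d in range(dim)]
    return centers, memberships_for(points, centers, m, sigma)


def undersample_majority(points, n_keep, **kwargs):
    """Return sorted indices of at most n_keep majority samples.

    Assumed rule: one representative per cluster, namely the sample with the
    highest membership in that cluster. Duplicates are merged, so fewer than
    n_keep indices may be returned.
    """
    _, u = kfcm(points, n_keep, **kwargs)
    kept = {max(range(len(points)), key=lambda k: u[k][i]) for i in range(n_keep)}
    return sorted(kept)
```

Under these assumptions, the returned indices would select a reduced normal-sample set that could be merged with the abnormal samples before feature screening and SVM training. The kernel width `sigma` would need tuning to the scale of the data.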