“…The combined base model is evaluated on four performance indicators, Recall, MAP@100, F1-score, and AUC, under the new feature set Y obtained after preprocessing, feature extraction, and dimensionality reduction of the SGCC dataset. At this stage, the meta-model of the stacking structure is a relatively simple linear regression (LR) model [38]. As discussed in Section 3.3, the classifiers selected for the base-model layer should be strong and numerous, so the performance index values of eight existing classifiers commonly used for electricity theft detection are compared under the new feature set Y. The eight classifiers are: random forest (RF) [39], eXtreme gradient boosting (XGBoost) [25], light gradient boosting machine (LightGBM) [40], support vector machine (SVM) [22], CART decision tree (DT) [23], deep forest (DF) [41], long short-term memory (LSTM) [28], and K-nearest neighbor (KNN) [42].…”
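
To make the four indicators concrete, the following is a minimal sketch of how Recall, F1-score, AUC, and MAP@N could be computed from a classifier's labels and suspicion scores. It is an illustration under stated assumptions, not the evaluation code used in the excerpt: the function names, the 0.5 decision threshold, and the toy data are hypothetical, and MAP@N is assumed to follow the definition commonly used in the electricity theft detection literature (the mean of the precision values taken at the rank of each true thief among the N most suspicious users). The excerpt itself does not spell out these definitions.

```python
from typing import Sequence, Tuple


def recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Recall and F1-score for the positive (theft) class, labelled 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def map_at_n(y_true: Sequence[int], scores: Sequence[float], n: int = 100) -> float:
    """MAP@N (assumed definition): average precision over the top-n ranked users."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)[:n]
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)  # precision at this thief's rank
    return sum(precisions) / len(precisions) if precisions else 0.0


# Hypothetical toy data: 1 = theft, 0 = normal; scores are suspicion levels.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

recall, f1 = recall_f1(y_true, y_pred)
print("Recall:", recall)
print("F1-score:", f1)
print("AUC:", auc(y_true, scores))
print("MAP@3:", map_at_n(y_true, scores, n=3))
```

Recall and F1-score depend on the chosen decision threshold, whereas AUC and MAP@N depend only on how the users are ranked by score, which is why the two pairs of indicators can move independently when classifiers are compared.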