Abstract: Estimating the losses of distribution feeders provides essential guidance for the planning, design, and operation of a distribution system. This paper proposes a novel method for estimating the statistical line loss of distribution feeders that combines a feeder clustering technique with a modified eXtreme Gradient Boosting (XGBoost) algorithm, based on feeder characteristic data collected in the smart power distribution and utilization system. To enhance the applicability and accuracy of the estimation model, a k-medoids algorithm with a weighted distance is proposed for clustering distribution feeders. A variable selection method for the clustering is also discussed, which considers the correlation and validity of the variables. The XGBoost algorithm is then modified by adding to its loss function a penalty term that accounts for the effect of the theoretical value on the estimate of statistical line loss. The proposed methodology is validated on 762 distribution feeders in the Shanghai distribution system. The results show that the XGBoost method is more accurate than decision tree, neural network, and random forest models, as measured by the Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and Absolute Percentage Error (APE) indexes. In particular, incorporating the theoretical value significantly improves the reasonableness of the estimated results.
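As an illustration of how a penalty on the theoretical value can enter a gradient-boosting loss, the following minimal sketch assumes a squared-error loss plus a quadratic penalty on the deviation from the theoretical value. The abstract does not give the penalty's form, so this form, the weight `lam`, and all function names are hypothetical and are not the paper's formulation. The sketch returns the per-sample gradient and Hessian, the two quantities a custom XGBoost objective must supply, and it also defines the RMSE, MAPE, and APE indexes used for comparison.

```python
import math


def penalized_grad_hess(preds, labels, theory, lam=0.5):
    """Gradient and Hessian of an ASSUMED loss, per sample:

        0.5 * (p - y)**2 + 0.5 * lam * (p - t)**2

    p: current prediction, y: statistical line loss (label),
    t: theoretical value, lam: hypothetical penalty weight.
    """
    grad = [(p - y) + lam * (p - t) for p, y, t in zip(preds, labels, theory)]
    hess = [1.0 + lam for _ in preds]  # constant second derivative
    return grad, hess


def ape(actual, predicted):
    """Absolute Percentage Error of each sample, in percent."""
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    errors = ape(actual, predicted)
    return sum(errors) / len(errors)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # One Newton step from p = 0 lands on the weighted average of the
    # label and the theoretical value: (y + lam * t) / (1 + lam).
    grad, hess = penalized_grad_hess([0.0], [4.0], [2.0], lam=1.0)
    print(0.0 - grad[0] / hess[0])
    print(mape([100.0, 200.0], [110.0, 180.0]))
```

Under this assumed loss, a larger `lam` pulls each estimate toward the theoretical value, while `lam = 0` recovers the ordinary squared-error objective.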