An analysis of the financial data of small- and medium-sized enterprises shows that class imbalance is a common problem. To address it, this paper constructs a financial distress prediction model that combines cluster-based under-sampling with LightGBM. Drawing on the idea of ensemble learning, this paper proposes the CUS-LightGBM (cluster-based under-sampling with LightGBM) model. First, the data are divided into minority and majority class samples. Then the K-means algorithm is used to cluster the majority class samples, and a subset of samples is selected from each cluster to form a balanced data set. Finally, the balanced data are combined with the decision-tree-based LightGBM algorithm to form an efficient prediction model. In addition, the data contain a large number of redundant features, which reduce the prediction accuracy and efficiency of the model. Therefore, this paper adopts feature selection based on an ensemble strategy and determines the main risk factors by majority voting (the principle of the minority obeying the majority). Finally, experiments on real financial data show that the CUS-LightGBM model significantly improves the ability to identify small- and medium-sized enterprises in financial distress, and that the proposed model is more effective than the benchmark model in processing financial ratio data.
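The under-sampling and feature-voting steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmeans`, `cluster_undersample`, `majority_vote`), the number of clusters, the proportional per-cluster quota, and the "more than half of the selectors" voting threshold are all assumptions made for illustration, since the abstract does not specify them. The sketch uses only the Python standard library so that it stays self-contained; a real pipeline would more likely rely on an established K-means implementation.

```python
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means on a list of numeric tuples; returns cluster labels."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from k distinct positions
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (squared distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_undersample(majority, minority, k=3, seed=0):
    """Cluster the majority class, then sample from every cluster.

    Assumption: each cluster contributes in proportion to its size, so the
    sampled majority set is roughly as large as the minority set.
    """
    rng = random.Random(seed)
    labels = kmeans(majority, k, seed=seed)
    target = len(minority)
    sampled = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        quota = round(target * len(members) / len(majority))
        sampled.extend(rng.sample(members, min(quota, len(members))))
    X = sampled + list(minority)
    y = [0] * len(sampled) + [1] * len(minority)  # 1 marks the minority class
    return X, y


def majority_vote(selections):
    """Keep a feature if more than half of the selectors chose it (assumed rule)."""
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n > len(selections) / 2)
```

The balanced `X, y` returned by `cluster_undersample` could then be passed to a gradient-boosting classifier, for example `lightgbm.LGBMClassifier().fit(X, y)`, and `majority_vote` could combine the feature subsets produced by several selection methods. Because each quota is rounded, the sampled majority set may differ from the minority set by a few samples.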