Short‐term wind‐power forecasting methods such as neural networks are trained by empirical risk minimization. Local optima and overfitting are therefore likely to arise in the model‐training stage, which weakens generalization in the prediction stage. To address this problem, this paper proposes a short‐term wind power forecasting model based on 2‐stage feature selection and a supervised random forest (RF). First, in data preprocessing, redundant features are removed by a variable importance measure and closely related samples are selected by correlation analysis, which improves both the efficiency of model training and the degree of correlation between input and output samples. Second, an improved supervised RF methodology composes a new RF by evaluating the performance of each decision tree and restructuring the decision trees accordingly. A new external validation index, correlated with the wind speed from numerical weather prediction, is proposed to overcome the shortcoming of the internal validation index, which depends heavily on the training samples. Simulation examples verify the rationality and feasibility of the improvement. Case studies on measured data from a wind farm show that the proposed model outperforms the original RF, a back propagation neural network, a Bayesian network, and a support vector machine in accuracy, efficiency, and robustness, especially when the historical data contain a high proportion of noisy data and wind power curtailment periods.
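The tree evaluation and restructuring step can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not define the external validation index, so plain RMSE on a held-out set stands in for it here, and the names `select_trees`, `forest_predict`, and `keep` are hypothetical. Each "tree" is assumed to be any callable that maps a feature vector to a power prediction.

```python
from math import sqrt
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "tree" is any callable from features to a prediction.
Tree = Callable[[Sequence[float]], float]


def rmse(pred: Sequence[float], true: Sequence[float]) -> float:
    """Root-mean-square error between two equally long sequences."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def select_trees(
    trees: List[Tree],
    X_val: Sequence[Sequence[float]],
    y_val: Sequence[float],
    keep: int,
) -> List[Tree]:
    """Score every tree on an external validation set and keep the best ones.

    Assumption: RMSE replaces the paper's external validation index,
    whose exact definition is not given in the abstract.
    """
    scored = []
    for tree in trees:
        preds = [tree(x) for x in X_val]
        scored.append((rmse(preds, y_val), tree))
    # Lower error ranks first; sort on the score only.
    scored.sort(key=lambda pair: pair[0])
    return [tree for _, tree in scored[:keep]]


def forest_predict(trees: List[Tree], x: Sequence[float]) -> float:
    """Average the predictions of the retained trees."""
    return sum(tree(x) for tree in trees) / len(trees)


# Toy usage: two reasonable predictors and one badly biased predictor.
toy_trees: List[Tree] = [
    lambda x: 2.0 * x[0],
    lambda x: 2.0 * x[0] + 0.1,
    lambda x: 100.0,
]
X_val = [[1.0], [2.0]]
y_val = [2.0, 4.0]
kept = select_trees(toy_trees, X_val, y_val, keep=2)
print(len(kept), forest_predict(kept, [3.0]))
```

In the toy usage the constant predictor has by far the largest validation error, so it is discarded and the rebuilt forest averages only the two remaining predictors. A real implementation would substitute trained decision trees and the paper's own index for the stand-ins used here.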