Wind power is an essential component of renewable energy: it conserves conventional energy sources such as coal and oil while reducing greenhouse gas emissions. To address the stochastic and intermittent nature of ultra-short-term wind power, a combined prediction model based on variational mode decomposition (VMD) and gradient boosting regression tree (GBRT) is proposed. First, VMD decomposes the original wind power signal into three meaningful components: a long-term component, a short-term component, and a randomness component. Second, based on the characteristics of these three components, a support vector machine (SVM) is selected to predict the long-term and short-term components, while a gated recurrent unit-long short-term memory (GRU-LSTM) network predicts the randomness component. Particle swarm optimization (PSO) tunes the structural parameters of the SVM and GRU-LSTM combination to improve prediction accuracy. In addition, a GBRT model predicts the residuals. Finally, the rolling predictions of the three components and the residuals are aggregated into the final forecast. The model is implemented in Python with TensorFlow 2.0, and a dataset measured at a wind farm is used for training and prediction. Comparative analysis shows that the proposed model achieves superior short-term wind power prediction performance, with a mean squared error of 0.0244, a mean absolute error of 0.1185, and a coefficient of determination of 0.9821.
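
The final aggregation step and the three reported metrics can be illustrated with a minimal pure-Python sketch. The function names (aggregate, mse, mae, r2) and the toy numbers are assumptions made for illustration only and are not the authors' TensorFlow implementation. The sketch also assumes that the final forecast is the element-wise sum of the three component forecasts and the GBRT residual forecast.

```python
from typing import List, Sequence


def aggregate(components: Sequence[Sequence[float]],
              residual: Sequence[float]) -> List[float]:
    """Element-wise sum of the component forecasts and the residual forecast."""
    return [sum(step) + r for step, r in zip(zip(*components), residual)]


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy forecasts (made-up values, not data from the paper).
long_term = [1.0, 2.0, 3.0]    # e.g. output of the SVM on the long-term component
short_term = [0.5, 0.5, 0.5]   # e.g. output of the SVM on the short-term component
randomness = [0.1, -0.1, 0.0]  # e.g. output of the GRU-LSTM on the randomness component
residual = [0.0, 0.1, -0.1]    # e.g. output of the GBRT residual model

forecast = aggregate([long_term, short_term, randomness], residual)
measured = [1.5, 2.6, 3.3]     # hypothetical measured power values
print(mse(measured, forecast), mae(measured, forecast), r2(measured, forecast))
```

Lower values of the mean squared error and mean absolute error, and a coefficient of determination closer to 1, indicate a better fit between the aggregated forecast and the measured series.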