Wind energy is highly volatile, and large-scale integration of wind power significantly affects grid stability. Accurate forecasting of wind turbine power can improve the grid's ability to absorb wind power and support economical grid operation. This paper proposes a multistep forecasting method for offshore wind turbine power based on multi-timescale inputs and an improved Transformer. First, the wind speed sequence is decomposed by variational mode decomposition (VMD) to extract informative temporal components and remove noise. The decomposed signals are then merged with the remaining time-series features, and the dataset is split into inputs at different timescales. A gated recurrent unit (GRU) receives the short-timescale inputs, and the improved Transformer captures the temporal relationships in the long-timescale inputs. Finally, a convolutional neural network (CNN) extracts the information at each time point from the output of each branch, and a fully connected layer outputs the multistep forecasts. Experiments were conducted on operating data from four wind turbines located in the interior of an offshore wind farm, away from its edge. On a four-step forecast, the proposed method achieved average errors of 0.0522 in MAE, 0.0084 in MSE, and 0.0907 in RMSE, outperforming the comparison methods LSTM, CNN-LSTM, LSTM-Attention, and Informer. These results indicate that the proposed method improves accuracy for multistep offshore wind turbine power forecasting.
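The multi-timescale split and the reported error metrics can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `make_multiscale_windows` and `forecast_errors`, the window lengths, and the choice to align both windows on the same final time step are assumptions made for illustration. Only the four-step horizon and the MAE, MSE, and RMSE metrics come from the text.

```python
import numpy as np


def make_multiscale_windows(features, target, short_len, long_len, horizon):
    """Build paired short and long input windows ending at the same time step.

    features: array of shape (T, F), e.g. decomposed wind speed plus other features
    target:   array of shape (T,), e.g. normalized turbine power
    Returns (short_x, long_x, y) with shapes
    (N, short_len, F), (N, long_len, F), (N, horizon).
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    if features.ndim != 2 or len(features) != len(target):
        raise ValueError("features must be (T, F) and match target length")
    if short_len > long_len:
        raise ValueError("short_len must not exceed long_len")

    n_samples = len(target) - long_len - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested windows")

    short_x, long_x, y = [], [], []
    for start in range(n_samples):
        end = start + long_len  # index of the first step to forecast
        long_x.append(features[start:end])             # long-timescale input
        short_x.append(features[end - short_len:end])  # most recent steps only
        y.append(target[end:end + horizon])            # multistep target
    return np.stack(short_x), np.stack(long_x), np.stack(y)


def forecast_errors(y_true, y_pred):
    """Return MAE, MSE, and RMSE averaged over all samples and forecast steps."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    mae = float(np.mean(np.abs(err)))
    mse = float(np.mean(err ** 2))
    return {"MAE": mae, "MSE": mse, "RMSE": float(np.sqrt(mse))}


if __name__ == "__main__":
    # Synthetic data only; window lengths are illustrative assumptions.
    feats = np.arange(40, dtype=float).reshape(20, 2)
    power = np.arange(20, dtype=float)
    short_x, long_x, y = make_multiscale_windows(
        feats, power, short_len=3, long_len=8, horizon=4
    )
    print(short_x.shape, long_x.shape, y.shape)
    print(forecast_errors(y, np.zeros_like(y)))
```

In the method described here, the short windows would feed the GRU branch and the long windows the improved Transformer branch. Averaged RMSE values reported across several turbines need not equal the square root of the averaged MSE, since the average is taken after the square root.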