The traditional model for wind turbine fault prediction is not sensitive to the time sequence data and cannot mine the deep connection between the time series data, resulting in poor generalization ability of the model. To solve this problem, this paper proposes an attention mechanism-based CNN-LSTM model. The semantic sensor data annotated by SSN ontology is used as input data. Firstly, CNN extracts features to get high-level feature representation from input data. Then, the latent time sequence connection of features in different time periods is learned by LSTM. Finally, the output of LSTM is input into the attention mechanism module to obtain more fault-related target information, which improves the efficiency, accuracy, and generalization ability of the model. In addition, in the data preprocessing stage, the random forest algorithm analyzes the feature correlation degree of the data to get the features of high correlation degree with the wind turbine fault, which further improves the efficiency, accuracy, and generalization ability of the model. The model is validated on the icing fault dataset of No. 21 wind turbine and the yaw dataset of No. 4 wind turbine. The experimental results show that the proposed model has better efficiency, accuracy, and generalization ability than RNN, LSTM, and XGBoost.