The operating process of a complex system usually manifests in multiple distinct operating modes. The operating mode of a wind turbine, for example, is strongly influenced by wind conditions, which change dynamically in the natural environment. The supervisory control and data acquisition (SCADA) system collects a wide range of parameters from wind turbines, which makes it possible to differentiate and model distinct operating modes. However, the high dimensionality of SCADA data makes such modeling both intricate and inefficient. In this study, we leverage engineering knowledge of the hierarchical structure of wind turbine variables and propose a novel method to efficiently cluster the data temporally by operating mode. Our method first groups the variables by subsystem and performs temporal clustering within each subsystem. We then introduce a novel graph neural network to extract and concatenate features from all subsystems, enabling discrimination of the operating mode of the entire system. Finally, we model these features to predict the output power, and the prediction residual can be used for monitoring. Evaluations on both numerical experiments and real-world wind turbine datasets demonstrate the effectiveness and superiority of the proposed method.