Abstract: The continual accumulation of power grid failure logs provides a valuable but rarely used source for data mining. Sequential analysis, which aims to exploit the temporal evolution of power grid failures and explore their future trends, is an increasingly promising approach to predictive scheduling and decision-making. In this paper, a temporal Latent Dirichlet Allocation (TLDA) framework is proposed to proactively reduce the cardinality of the event categories and estimate future failure distributions by automatically uncovering hidden patterns. The failure sequence is modeled as a mixture of several failure patterns, each of which is characterized by an infinite mixture of failures with certain probabilities. This state space dependency is captured by a hierarchical Bayesian framework. The model is temporally extended by establishing long-term dependencies with new co-occurrence patterns. Evaluation on high voltage circuit breakers (HVCBs) demonstrated that the TLDA model achieved fidelities of 51.13%, 73.86%, and 92.93% in the Top-1, Top-5, and Top-10 failure prediction tasks, respectively, outperforming the baselines. In addition to the quantitative results, we show that the TLDA can be successfully used to extract time-varying failure patterns and to capture failure associations with a cluster coalition method.
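The mixture idea summarized in the abstract can be illustrated with a minimal sketch. It is not the TLDA implementation: the pattern weights, the per-pattern failure probabilities, and all function names are hypothetical, and the sketch assumes that "fidelity" in a Top-k task means the share of cases in which the actual failure is among the k highest-ranked predictions.

```python
# Illustrative sketch only: hypothetical numbers and names, not the paper's data or code.

def failure_distribution(pattern_weights, pattern_failure_probs):
    """Mix per-pattern failure distributions into one predictive distribution."""
    n_failures = len(pattern_failure_probs[0])
    return [
        sum(w * probs[f] for w, probs in zip(pattern_weights, pattern_failure_probs))
        for f in range(n_failures)
    ]

def top_k(distribution, k):
    """Indices of the k most probable failure categories, most probable first."""
    return sorted(range(len(distribution)), key=lambda f: -distribution[f])[:k]

def top_k_hit_rate(predictions, actual, k):
    """Fraction of cases whose actual failure is among the top-k predictions."""
    hits = sum(1 for dist, a in zip(predictions, actual) if a in top_k(dist, k))
    return hits / len(actual)

# Two hypothetical failure patterns over four hypothetical failure categories.
patterns = [
    [0.7, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.3, 0.6],
]
# Hypothetical mixture weights of the two patterns for one failure sequence.
weights = [0.25, 0.75]

dist = failure_distribution(weights, patterns)
ranked = top_k(dist, 2)
```

Here `dist` is the weighted sum of the two pattern distributions, and `ranked` lists the two failure categories the mixture considers most likely. Calling `top_k_hit_rate` on a list of such distributions and the failures actually observed gives a Top-k score under the assumption stated above.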