Intelligent fault diagnosis is a promising tool for handling industrial big data because it can process collected signals rapidly and efficiently and provide accurate diagnosis results. Traditional static intelligent diagnosis methods, however, neglect the correlation between sequential data and cannot effectively extract features from raw data. Therefore, this paper proposes a three-stage fault diagnosis method based on a gated recurrent unit (GRU) network. First, a moving horizon divides the raw data into several sequence units, which serve as the input of the GRU; in this way, the sequence can be intercepted to obtain information as needed. Then, a deep GRU network with batch normalization (BN) is established to extract dynamic features from the sequence units effectively. Finally, softmax regression is employed to classify faults based on the dynamic features, so that the diagnosis result is obtained with a probabilistic explanation. The proposed method is validated on two chemical processes: the Tennessee Eastman (TE) benchmark process and the para-xylene (PX) oxidation process. In the TE case, the diagnosis results demonstrate that the proposed method is superior to conventional methods. Furthermore, in the PX oxidation case, the results show that the proposed method also performs well with only a small amount of historical data.
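
As a rough illustration of the first and third stages, the sketch below splits a time-ordered record into overlapping sequence units with a moving horizon and converts raw class scores into probabilities with softmax. It is only a minimal sketch under stated assumptions: the function names `moving_horizon` and `softmax`, the `width` and `step` parameters, and the sample values are hypothetical and not taken from the paper, and the GRU network with BN that would sit between the two stages is omitted.

```python
import math


def moving_horizon(samples, width, step=1):
    """Split a time-ordered list of samples into overlapping sequence units.

    `width` (unit length) and `step` (shift between units) are illustrative
    parameters, not values from the paper.
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(samples) - width
    return [samples[i:i + width] for i in range(0, last_start + 1, step)]


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical record: 6 time steps, 2 process variables per step.
raw = [[0.1, 1.0], [0.2, 1.1], [0.3, 0.9], [0.4, 1.2], [0.5, 1.0], [0.6, 0.8]]

# Each unit holds 3 consecutive time steps and would be one GRU input sequence.
units = moving_horizon(raw, width=3)

# Hypothetical scores for three fault classes, standing in for the output of
# the omitted GRU feature extractor.
probs = softmax([2.0, 1.0, 0.1])
```

Each entry of `probs` can be read as the estimated probability of the corresponding fault class, which is what gives the diagnosis result its probabilistic interpretation.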