Metro train operation can generate large volumes of complex, unstructured fault text data. To address the low accuracy and incomplete classification that affect automatic classification of such unstructured fault data, a BERT-BiGRU fault text classification model based on key layer fusion is proposed. First, the unstructured text data is converted in the word embedding layer into word vectors that carry position information, and these vectors are fed into the BERT layer. Building on the standard 12-layer BERT model, the outputs of the bidirectional Transformer encoders in layers 2, 4, 6, 8, and 12 are fused and reduced in dimension so that semantic information is captured more fully. The fused representation is then fed into the BiGRU layer, which extracts contextual information to obtain a high-level feature representation of the text. Finally, the output layer produces the classification result through a fully connected (FC) layer and a softmax function. The model is compared with other mainstream models on fault text data from metro on-board equipment. The experimental results show that, on the same data set, the
F1-score of this model is 7% to 8% higher than that of current mainstream classification models and about 1% to 2% higher than that of other pre-trained language models (PLMs), and the
F1-score of this model is higher than that of BERT models using different Transformer layers, reaching a maximum of 0.9272, and the model converges quickly during training.
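
The pipeline described above (embedding with position information, a 12-layer encoder, fusion of layers 2, 4, 6, 8, and 12, dimensionality reduction, BiGRU, FC, and softmax) can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `KeyLayerFusionBiGRU`, all sizes, the randomly initialised encoder stack that stands in for pre-trained BERT, the choice of concatenation followed by a linear projection as the fusion step, and the use of the final BiGRU hidden states as the text feature are all hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn


class KeyLayerFusionBiGRU(nn.Module):
    """Illustrative sketch: encoder stack + key-layer fusion + BiGRU + FC + softmax."""

    def __init__(self, vocab_size=1000, hidden=64, heads=4, num_layers=12,
                 key_layers=(2, 4, 6, 8, 12), gru_hidden=32, num_classes=5,
                 max_len=128):
        super().__init__()
        self.key_layers = set(key_layers)
        # Word embedding layer: token vectors plus learned position information.
        self.tok_emb = nn.Embedding(vocab_size, hidden)
        self.pos_emb = nn.Embedding(max_len, hidden)
        # Assumption: a small untrained stack stands in for the 12 BERT layers.
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(hidden, heads, dim_feedforward=4 * hidden,
                                       batch_first=True)
            for _ in range(num_layers)
        ])
        # Assumption: fuse by concatenating key-layer outputs, then reduce
        # the dimension back to `hidden` with a linear projection.
        self.reduce = nn.Linear(hidden * len(key_layers), hidden)
        self.bigru = nn.GRU(hidden, gru_hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * gru_hidden, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer tensor, seq_len <= max_len.
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.tok_emb(token_ids) + self.pos_emb(positions)
        kept = []
        for index, layer in enumerate(self.layers, start=1):
            x = layer(x)
            if index in self.key_layers:
                kept.append(x)  # keep the output of each key layer
        fused = self.reduce(torch.cat(kept, dim=-1))  # (batch, seq_len, hidden)
        _, h_n = self.bigru(fused)                    # h_n: (2, batch, gru_hidden)
        # Assumption: final forward and backward states form the text feature.
        features = torch.cat([h_n[0], h_n[1]], dim=-1)
        return torch.softmax(self.fc(features), dim=-1)


model = KeyLayerFusionBiGRU().eval()
with torch.no_grad():
    probs = model(torch.randint(0, 1000, (3, 16)))  # 3 texts, 16 tokens each
print(probs.shape)  # one probability vector per text
```

With a pre-trained model, the per-layer outputs would come from BERT itself rather than from an untrained stack; in the Hugging Face `transformers` library, for example, passing `output_hidden_states=True` to `BertModel` returns the hidden states of every layer, from which the key layers could be selected.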