To standardize driver behaviour and enhance transportation system safety, a dynamic driver behaviour recognition method based on the Recurrent All‐Pairs Field Transforms (RAFT) temporal model is proposed. This study involves the creation of two datasets, namely, Driver‐img and Driver‐vid, including driver behaviour images and videos across various scenarios. These datasets are subject to preprocessing using RAFT optical flow techniques to enhance the cognitive process of the network. This approach employs a two‐stage temporal model for driver behaviour recognition. In the initial stage, the MobileNet network is optimized and the GYY module is introduced, which includes residuals and global average pooling layers, thereby enhancing the network's feature extraction capabilities. In the subsequent stage, a bidirectional GRU network is constructed to learn driver behaviour video features with temporal information. Additionally, a method for compressing and padding video frames is proposed, which serves as input to the GRU network and enables intent prediction 0.2 s prior to driver actions. Model performance is assessed through accuracy, recall, and F1 score, with experimental results indicating that RAFT preprocessing enhances accuracy, reduces training time, and improves overall model stability, facilitating the recognition of driver behaviour intent.