Human-machine co-driving is an important stage in the development of autonomous driving, and accurate recognition of driver behavior is the basis for realizing it. However, traditional detection methods suffer from low accuracy and slow processing when applied to driver behavior detection. To address these challenges, this paper proposes a driver behavior detection method based on an improved Swin Transformer model. First, an efficient channel attention (ECA) module is added after the self-attention mechanism of the Swin Transformer so that channel features are dynamically reweighted according to their importance, which strengthens the model's attention to the important channels. Second, the images of the public State Farm dataset are preprocessed and the original image dataset is expanded. Third, the model parameters are tuned. Finally, comparison experiments against other models and an ablation study are performed to verify the performance of the proposed model. The results show that the proposed model performs better on the 10-class driver behavior detection task, reaching an accuracy of 99.42%, which is 3.8% and 1.68% higher than VGG16 and MobileNetV2, respectively. This work can provide a theoretical reference for the development of human-machine co-driving systems for intelligent vehicles.
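To make the channel reweighting idea concrete, the following is a minimal, dependency-free sketch of an ECA-style gate applied to a sequence of token vectors, such as the output of a self-attention layer. It is an illustration under stated assumptions, not the implementation used in this paper: the function names `eca_kernel_size` and `eca`, the averaging over tokens, the zero padding, and the hand-set kernel weights are all hypothetical choices. The kernel-size rule with `gamma=2` and `b=1` follows the commonly used ECA defaults. In a trained model the kernel weights would be learned.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size from the channel count (common ECA defaults)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1  # force an odd kernel size


def eca(tokens, weights):
    """Reweight channels of `tokens` (N vectors of length C) with an ECA-style gate.

    `weights` is an odd-length 1D kernel; here it is supplied by hand (assumption),
    whereas a real model would learn it during training.
    """
    n, c = len(tokens), len(tokens[0])
    pad = len(weights) // 2

    # Squeeze: one descriptor per channel, averaged over all tokens.
    pooled = [sum(tok[j] for tok in tokens) / n for j in range(c)]

    # Local cross-channel interaction: 1D convolution along the channel axis
    # with zero padding, followed by a sigmoid to obtain gates in (0, 1).
    gates = []
    for j in range(c):
        s = 0.0
        for i, w in enumerate(weights):
            idx = j + i - pad
            if 0 <= idx < c:
                s += w * pooled[idx]
        gates.append(1.0 / (1.0 + math.exp(-s)))

    # Excite: scale every token's channels by the corresponding gate.
    scaled = [[tok[j] * gates[j] for j in range(c)] for tok in tokens]
    return scaled, gates


if __name__ == "__main__":
    toy_tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
    scaled, gates = eca(toy_tokens, weights=[0.0, 1.0, 0.0])
    print(eca_kernel_size(96), gates, scaled)
```

Because the gate for each channel depends only on its few neighbouring channels, the module adds just `k` parameters, which is why it is attractive as a lightweight addition to an existing attention block.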