As the number of urban vehicles grows and road environments become more complex, real-time vehicle detection has become a key technology in autonomous driving, yet it still faces many challenges. Traditional two-stage object detection algorithms (such as the R-CNN series) achieve high detection accuracy, but their poor real-time performance makes it difficult for them to meet the needs of vehicle detection. In contrast, one-stage detection algorithms such as YOLO stand out for combining high speed with high accuracy. However, the detection performance of the latest YOLOv9 model in urban vehicle scenarios still leaves room for improvement. This paper therefore improves the YOLOv9 model by introducing the SENetV1 attention mechanism into the backbone feature extraction network. Experimental results show that, under the same training conditions, the mAP of the improved algorithm in vehicle detection increases by 5%. The modification enhances the model's ability to capture relationships between channels, improves its feature representation, and broadens the applicability of YOLOv9 in autonomous driving.
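
To illustrate the channel attention referred to above, the following is a minimal NumPy sketch of a squeeze-and-excitation block in its usual formulation: global average pooling per channel, a two-layer bottleneck with ReLU and sigmoid, and channel-wise rescaling of the feature map. It is not the implementation used in this paper. The function name `se_block`, the reduction ratio `r`, the tensor shape, and the random weights are illustrative assumptions, and the position of the block inside the YOLOv9 backbone is not shown.

```python
import numpy as np


def sigmoid(v):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-v))


def se_block(x, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to one feature map (hypothetical helper).

    x  : feature map of shape (C, H, W)
    w1 : first bottleneck weights, shape (C // r, C)
    b1 : first bottleneck bias, shape (C // r,)
    w2 : second bottleneck weights, shape (C, C // r)
    b2 : second bottleneck bias, shape (C,)
    Returns the rescaled feature map and the per-channel weights.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    z = x.mean(axis=(1, 2))                 # shape (C,)
    # Excitation: reduce to C // r units with ReLU, then expand back to C.
    h = np.maximum(w1 @ z + b1, 0.0)        # shape (C // r,)
    s = sigmoid(w2 @ h + b2)                # shape (C,), each value in (0, 1)
    # Scale: reweight every channel of the input by its attention weight.
    return x * s[:, None, None], s


# Illustrative sizes only; real backbones use far more channels.
C, H, W, r = 8, 4, 4, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

out, s = se_block(x, w1, b1, w2, b2)
print(out.shape, s.round(3))
```

The output has the same shape as the input, so a block of this kind can be placed after a backbone stage without changing the layers that follow it. The only added parameters are the two small bottleneck matrices.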