A single sensor often fails to meet the needs of practical applications because of its limited robustness and poor detection accuracy in harsh weather and complex environments. To address this problem, this paper proposes a vehicle detection method based on the fusion of millimeter-wave (mmWave) radar and monocular vision. The method combines the strengths of mmWave radar in measuring distance and speed with those of vision in classifying objects. First, the raw mmWave radar point cloud is processed by the proposed data pre-processing algorithm to obtain 3D detection points with higher confidence. Next, a clustering fusion algorithm based on density-based spatial clustering of applications with noise (DBSCAN) and the nearest neighbor algorithm are used to associate data within the same frame and across adjacent frames, respectively. The effective targets from mmWave radar and vision are then matched under spatio-temporal alignment. Successfully matched targets are output using the Kalman weighted fusion algorithm, while unmatched targets are marked as new targets for tracking and handled within a valid cycle. Finally, experiments demonstrate that the proposed method improves target localization and detection accuracy, reduces missed detections, and efficiently fuses the data from the two sensors.
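
The matching and fusion steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the Kalman weighted fusion algorithm, so the sketch substitutes a static inverse-variance weighted average (the form a Kalman update takes when two independent direct measurements of the same position are combined). The `Target` type, the scalar variance model, the greedy matching order, and the `gate` value are all hypothetical, and both sensors are assumed to be already aligned in a common coordinate frame.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Target:
    x: float    # lateral position in the shared frame (m), assumed already aligned
    y: float    # longitudinal position in the shared frame (m)
    var: float  # assumed scalar position variance (m^2), hypothetical noise model


def fuse(radar: Target, vision: Target) -> Target:
    """Inverse-variance weighted average of two position estimates."""
    w_r = 1.0 / radar.var
    w_v = 1.0 / vision.var
    total = w_r + w_v
    return Target(
        x=(w_r * radar.x + w_v * vision.x) / total,
        y=(w_r * radar.y + w_v * vision.y) / total,
        var=1.0 / total,  # fused variance is smaller than either input variance
    )


def match_and_fuse(radar_targets, vision_targets, gate=2.0):
    """Greedy nearest-neighbor matching inside a distance gate (hypothetical value).

    Returns (fused, unmatched_radar, unmatched_vision). Unmatched targets would
    be handed to the tracker as new targets in a full pipeline.
    """
    fused, unmatched_radar = [], []
    free = set(range(len(vision_targets)))  # vision targets not yet matched
    for r in radar_targets:
        best, best_d = None, gate
        for j in sorted(free):
            v = vision_targets[j]
            d = hypot(r.x - v.x, r.y - v.y)
            if d < best_d:
                best, best_d = j, d
        if best is None:
            unmatched_radar.append(r)
        else:
            free.remove(best)
            fused.append(fuse(r, vision_targets[best]))
    unmatched_vision = [vision_targets[j] for j in sorted(free)]
    return fused, unmatched_radar, unmatched_vision
```

With one radar target close to a vision target and one far from any, the first pair is fused with the lower-variance sensor weighted more heavily, and the second radar target is returned as unmatched. A global assignment method could replace the greedy loop when targets are dense; the greedy form is used here only to keep the sketch short.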