Decision-level fusion methods that combine radar and vision often suffer from low target-matching success rates and poor multi-target detection accuracy. To address these issues, this paper proposes a robust target detection algorithm based on the fusion of a frequency-modulated continuous wave (FMCW) radar and a monocular camera. First, a lane detection algorithm processes the image to obtain lane information. Next, the radar data are processed with a two-dimensional fast Fourier transform (2D-FFT), constant false alarm rate (CFAR) detection, and density-based spatial clustering of applications with noise (DBSCAN), while the YOLOv5 algorithm detects targets in the image. The lane lines are then used to filter out interfering targets that lie outside the lanes. Finally, multi-sensor information fusion is performed for targets in the same lane. Experiments show that the balanced score of the proposed algorithm can reach 0.98, indicating low false-detection and missed-detection rates. Moreover, the balanced score remains almost unchanged across different environments, which indicates that the algorithm is robust.
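
The lane-based filtering and same-lane fusion steps summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the names `RadarTarget`, `VisionTarget`, `lane_of`, `fuse_same_lane`, and `balanced_score`, the lane boundaries expressed as lateral offsets in metres, the nearest-range matching rule, and the `max_gap_m` threshold are all hypothetical choices made for illustration. The sketch also assumes that the balanced score is the usual F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class RadarTarget:
    """Hypothetical radar cluster centre (e.g. one DBSCAN cluster)."""
    lateral_m: float  # lateral offset from the sensor axis, in metres
    range_m: float    # longitudinal distance, in metres


@dataclass
class VisionTarget:
    """Hypothetical camera detection (e.g. one YOLOv5 box)."""
    lane: int         # lane index assigned from the detected lane lines
    range_m: float    # distance estimated from the image, in metres
    label: str


def lane_of(lateral_m: float, boundaries: Sequence[float]) -> Optional[int]:
    """Return the index of the lane containing lateral_m, or None if outside.

    boundaries is an ascending list of lane-line positions (assumed format).
    """
    for i in range(len(boundaries) - 1):
        if boundaries[i] <= lateral_m < boundaries[i + 1]:
            return i
    return None


def fuse_same_lane(
    radar: List[RadarTarget],
    vision: List[VisionTarget],
    boundaries: Sequence[float],
    max_gap_m: float = 5.0,  # assumed matching threshold, not from the paper
) -> List[Tuple[RadarTarget, VisionTarget]]:
    """Pair each in-lane radar target with the closest unused vision target
    in the same lane; radar targets outside all lanes are discarded."""
    fused = []
    used = set()
    for r in radar:
        lane = lane_of(r.lateral_m, boundaries)
        if lane is None:
            continue  # interfering target outside the lanes
        best_idx, best_gap = None, max_gap_m
        for j, v in enumerate(vision):
            if j in used or v.lane != lane:
                continue
            gap = abs(v.range_m - r.range_m)
            if gap <= best_gap:
                best_idx, best_gap = j, gap
        if best_idx is not None:
            used.add(best_idx)
            fused.append((r, vision[best_idx]))
    return fused


def balanced_score(tp: int, fp: int, fn: int) -> float:
    """F1 score from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Restricting candidate pairs to the same lane shrinks the matching problem before any distance comparison is made, which is one plausible reason lane information would improve the matching success rate. Under the F1 assumption, a score near 1 requires both few false detections and few missed detections, consistent with how the abstract interprets the reported value.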