One-stage detection models that work without a prior segmentation step are rarely used for the automatic detection of coronary lesions. This study sequentially enrolled 200 patients with significant stenoses and occlusions of the right coronary artery and categorized their angiography images into two angle views: the CRA (cranial) view, with 2453 images from 98 patients, and the LAO (left anterior oblique) view, with 3338 images from 176 patients. Patients were randomized at the patient level to the training set and the test set in a 7:3 ratio. YOLOv5 was adopted as the key model for direct detection. Four types of lesions were studied: local stenosis (LS), diffuse stenosis (DS), bifurcation stenosis (BS), and chronic total occlusion (CTO). At the image level, the model's precision, recall, mAP@0.1, and mAP@0.5 were 0.64, 0.68, 0.66, and 0.49 in the CRA view and 0.68, 0.73, 0.70, and 0.56 in the LAO view, respectively. At the patient level, the model's precision, recall, and F1 score were 0.52, 0.91, and 0.65 in the CRA view and 0.50, 0.94, and 0.64 in the LAO view, respectively. YOLOv5 performed best on CTO and LS lesions at both the image level and the patient level. In conclusion, a one-stage model without segmentation, such as YOLOv5, is feasible for automatic coronary lesion detection and is best suited to LS and CTO lesions.
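Randomizing at the patient level means that all images from one patient fall entirely in either the training set or the test set, which avoids leakage between near-identical frames of the same angiogram. The following is a minimal sketch of such a split, not the study's actual procedure: the function name `split_by_patient`, the `image_to_patient` mapping, the seed, and the toy identifiers are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(image_to_patient, train_fraction=0.7, seed=0):
    """Split image IDs into train/test lists so no patient appears in both.

    image_to_patient: dict mapping image ID -> patient ID (hypothetical input).
    train_fraction: share of patients (not images) assigned to training.
    """
    # Group image IDs by the patient they belong to.
    images_by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        images_by_patient[patient_id].append(image_id)

    # Shuffle the patients, not the images, with a fixed seed for repeatability.
    patients = sorted(images_by_patient)
    random.Random(seed).shuffle(patients)

    # Assign the first share of shuffled patients to training, the rest to test.
    n_train = round(train_fraction * len(patients))
    train_patients = patients[:n_train]
    test_patients = patients[n_train:]

    train_images = [img for p in train_patients for img in images_by_patient[p]]
    test_images = [img for p in test_patients for img in images_by_patient[p]]
    return train_images, test_images


# Toy data: 10 hypothetical patients with 3 images each.
toy_mapping = {
    f"patient{p:02d}_frame{i}": f"patient{p:02d}"
    for p in range(10)
    for i in range(3)
}
train_images, test_images = split_by_patient(toy_mapping, train_fraction=0.7)
```

Because the ratio applies to patients, the image-level ratio will generally differ from 7:3 when patients contribute different numbers of images.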