With the development of semiconductor processing technology, the components on wafers have become highly integrated and complex, which can increase the number of surface defects. Traditional wafer defect detection, performed manually by experienced engineers using computer-aided tools, has several disadvantages: small defective areas are missed, efficiency is low, and the process tends to cause secondary contamination. It is therefore essential to identify and locate die defects quickly, which can effectively improve wafer yield. This paper proposes a novel wafer defect detection method based on clustering-template matching combined with an improved YOLOv5 network (CTM-IYOLOv5). An inspection platform acquires images of wafers in real time. Pre-processing operations are applied to the acquired grain maps, which reduces the computation required for image processing and improves inspection efficiency. To address the false detections and missed detections caused by multiple inter-grain channels in the field of view, clustering-template matching is used to segment the grains in the field of view. In addition, the convolutional layer structure, network structure, and data augmentation of YOLOv5 are improved so that the network is lighter, more accurate, and better suited to small-target detection, with the model size reduced by 16%. The experimental results show that the proposed method reaches an accuracy of up to 99% and improves efficiency by 44%, which meets the need for real-time detection of wafer defects.
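The grain segmentation step can be illustrated with a minimal sketch. It is not the implementation used in this paper: the binary toy image, the sum-of-absolute-differences score, the gap-based one-dimensional clustering, and the names `match_template` and `cluster_1d` are all assumptions chosen for illustration. The sketch finds every window that matches a grain template and then clusters the match coordinates into grid rows and columns, so that each grain is separated from the inter-grain channels.

```python
def match_template(image, template, max_sad=0):
    """Return (row, col) of each window whose sum of absolute
    differences (SAD) from the template is at most max_sad."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            sad = sum(
                abs(image[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if sad <= max_sad:
                hits.append((r, c))
    return hits


def cluster_1d(values, gap):
    """Group sorted values so that consecutive members of a cluster
    differ by at most gap; return the mean of each cluster."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return [sum(c) / len(c) for c in clusters]


# Hypothetical field of view: four 2x2 grains (1) separated by
# one-pixel inter-grain channels (0).
image = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
template = [[1, 1], [1, 1]]  # assumed grain template

hits = match_template(image, template)
rows = cluster_1d([r for r, _ in hits], gap=1)  # grid row origins
cols = cluster_1d([c for _, c in hits], gap=1)  # grid column origins

print(hits)  # top-left corner of each matched grain
print(rows, cols)
```

In this toy case, windows that straddle a channel contain zeros and are rejected, so only the four grain positions survive. On real images a normalized correlation score and a tolerance above zero would presumably be needed, and the clustering then merges near-duplicate matches around each grain.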