To prevent needlestick injury and leftover instruments, and to perform efficient dental treatment, it is important to know the instruments required during dental treatment. Therefore, we will obtain a dataset for image recognition of dental treatment instruments, develop a system for detecting dental treatment instruments during treatment by image recognition, and evaluate the performance of the system to establish a method for detecting instruments during treatment. We created an image recognition dataset using 23 types of instruments commonly used in the Department of Restorative Dentistry and Endodontology at Osaka University Dental Hospital and a surgeon’s hands as detection targets. Two types of datasets were created: one annotated with only the characteristic parts of the instruments, and the other annotated with the entire parts of instruments. YOLOv4 and YOLOv7 were used as the image recognition system. The performance of the system was evaluated in terms of two metrics: detection accuracy (DA), which indicates the probability of correctly detecting the number of target instruments in an image, and the average precision (AP). When using YOLOv4, the mean DA and AP were 89.3% and 70.9%, respectively, when the characteristic parts of the instruments were annotated and 85.3% and 59.9%, respectively, when the entire parts of the instruments were annotated. When using YOLOv7, the mean DA and AP were 89.7% and 80.8%, respectively, when the characteristic parts of the instruments were annotated and 84.4% and 63.5%, respectively, when the entire parts of the instruments were annotated. The detection of dental instruments can be performed efficiently by targeting the parts characterizing them.