To compensate for the severe shortage of scrub nurses, who support surgeons during surgery, Miyawaki et al. have developed a scrub nurse robot (SNR) system [1,2,3,4,5]. One of its current challenges is enabling the SNR to recognize the surgical procedures that make up a surgical operation and to understand and predict surgeons' intentions. In this paper, we therefore propose a visual recognition system for surgeons' actions based on a convolutional neural network (CNN). We developed a temporal pose feature (TPF) CNN, a method that recognizes surgical procedures from the body movements of a surgeon's stand-in during a simulated surgical operation. We used OpenPose to extract a pose feature vector from every frame of short videos recorded during our simulated surgery. We then arranged these pose vectors in chronological order to form a matrix and fed it to the CNN as a pseudo-grayscale image. We show that, for the procedures examined in this study, the TPF CNN was more accurate than a conventional long short-term memory (LSTM) network, which is commonly used to recognize time-series data. The TPF CNN also reached higher recognition accuracy with less training than the LSTM. Our results suggest that surgeons' body movements may contain much of the information required to recognize subtle differences among several types of surgical procedures.
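The construction of the CNN input can be illustrated with a minimal sketch: per-frame pose vectors are stacked in chronological order so that each row is one frame and each column is one keypoint coordinate, and the resulting matrix is treated as a pseudo-grayscale image. The details here are assumptions for illustration, not the configuration used in this work: the function name `build_tpf_image` is hypothetical, keypoints are taken as (x, y) pixel pairs already extracted by a pose estimator such as OpenPose, and coordinates are scaled by the frame size and mapped to intensities in 0..255.

```python
def build_tpf_image(pose_frames, frame_width, frame_height):
    """Stack per-frame pose vectors into a pseudo-grayscale image.

    pose_frames: list of frames in chronological order; each frame is a list
    of (x, y) pixel coordinates, one pair per keypoint.
    Returns a list of rows (one per frame), each row holding integer
    intensities in 0..255 ordered as x0, y0, x1, y1, ...
    """
    if not pose_frames:
        raise ValueError("need at least one frame")
    n_keypoints = len(pose_frames[0])
    image = []
    for frame in pose_frames:
        if len(frame) != n_keypoints:
            raise ValueError("every frame must have the same number of keypoints")
        row = []
        for x, y in frame:
            # Assumption: normalize by frame size, clamp to [0, 1],
            # then map to the 0..255 range of a grayscale pixel.
            for value, extent in ((x, frame_width), (y, frame_height)):
                unit = min(max(value / extent, 0.0), 1.0)
                row.append(round(unit * 255))
        image.append(row)
    return image


# Hypothetical 3-frame clip with 2 keypoints from a 640x480 video.
clip = [
    [(0, 0), (320, 240)],
    [(64, 48), (320, 240)],
    [(640, 480), (320, 240)],
]
tpf = build_tpf_image(clip, 640, 480)
print(len(tpf), "rows x", len(tpf[0]), "columns")
```

In this layout the vertical axis of the image encodes time and the horizontal axis encodes body-joint coordinates, so a 2-D convolution sees local patterns that span both neighboring frames and neighboring keypoints.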